Language Log
University commas
The current xkcd comic:
http://languagelog.ldc.upenn.edu/myl/university_commas_2x.png
Mouseover title: "The distinctive 'UCLA comma' and 'Michigan comma' are a long string of commas at the start and end of the sentence respectively."
I guess Penn, Brown, Berkeley, CalTech, …, should be grateful for being left out.
I'll spare you our past posts on the Oxford comma, except this one.
➖ @EngSkills ➖
Word of the Day
Word of the Day: debacle
This word has appeared in 259 articles on NYTimes.com in the past year. Can you use it in a sentence?
➖ @EngSkills ➖
Phrasal Verb of the Day | Vocabulary | EnglishClub
hang out (1)
to hang wet clothes outside to dry
➖ @EngSkills ➖
Word of the Day
orifice
Definition: (noun) An opening, especially to a cavity or passage of the body; a mouth or vent.
Synonyms: opening, porta.
Usage: The nose was but a gaping orifice above a deformed and twisted mouth.
➖ @EngSkills ➖
Funny Or Die (Youtube)
TOMORROW we welcome Derek Waters Inside the Funny or Die Vault
Subscribe now: https://www.youtube.com/c/funnyordie?sub_confirmation=1
Get more Funny Or Die
-------------------------------
Facebook: https://www.facebook.com/funnyordie
Twitter: https://twitter.com/funnyordie
Instagram: http://instagram.com/funnyordie
TikTok: https://www.tiktok.com/@funnyordie
➖ @EngSkills ➖
Language Log
Gyro, part 3
"Turkey’s döner kebab spat with Germany is turning nasty", by Daniel Thorpe, The Spectator (10/5/24)
Last April, German president Frank-Walter Steinmeier decided to bring along a 60-kilogram döner kebab on his state visit to Turkey. It did not go down well. Turks found the stunt condescending; Germans were mortified. Ankara lodged an official request with the European Commission to make the dish a ‘traditional speciality’, thereby regulating what can be sold under the name ‘döner’ in Europe.
It's not just a culinary matter — it's political, cultural, and technical:
Though seemingly rather mundane, the latest disagreement over the classification of döner kebab indicates there is little love lost between the two capitals. Turkey aims to standardise and dictate through the European Commission what can be sold as döner kebab, breaking it down to the finest details, such as the meat composition, pH value, salt content, and the thickness of the slices that come off. German politicians and business owners, including those of Turkish background, are not happy.
I don't blame them — döner kebab has become a central element in German culture, especially among youth, but because of the garlic sauce and raw onions, you'd better be careful about when you eat it:
Today, the döner kebab is the most popular fast food in Germany, even more than the godforsaken currywurst. A German village so small that it does not even have a pub might still have a döner kebab eatery. It was introduced by the Turkish migrant workers in the 1960s and 1970s. As often happens with dishes cooked far from their motherlands, the döner kebab started to be prepared in ways different from the ‘original’, catering to local tastes with the available ingredients.
Customers are not too concerned about shops having to sell the product under different names, such as Greek gyros or Arabic shawarma. What really grates is the rising cost of a döner kebab:
‘A few years ago, the price of a döner kebab was around four euros. Now you pay up to eight-thirteen euros,’ [Niko Schmitz] laments. Schmitz is not alone. ‘I’m paying eight euros for a döner,’ a protestor shouted at chancellor Olaf Scholz in 2022. ‘Speak with Putin, please. I want to pay four euros for a döner.’
A brief note on the history of the mouth-watering snack:
The döner itself can be traced back to the early 19th century Ottoman Empire, when someone had the ingenious idea of flipping the existing horizontal stack of marinated meat on an iron rod vertically. Turning the rotisserie upright not only saved much of the juices and fat from dripping into the fire but also rendered it more suitable for urban spaces.
Where is all this headed? Over such contentious issues as those being argued about by the German and Turkish governments, the product may branch into different varieties, and the plethora of names for them will undoubtedly continue to proliferate.
Selected readings
* "Gyro" (6/26/20)
* "Gyro, part 2" (9/28/24)
* "Teen attacked by kebab van" (9/5/12)
* "Nontrivial script fail" (5/18/11) — 7th comment
* "'Ingenious herd of charcoal fire'" (4/5/11)
* "Why Do Canadians Eat Donair?" (4/13/07)
* "If you're uneducated you say it right" (2/2/09) — in the comments
* "Ajvar and caviar" (8/1/22)
* "Respect the local pronunciation: runza and Henri" (6/13/24)
[Thanks to Mark Metcalf]
➖ @EngSkills ➖
Slang of the Day | Vocabulary | EnglishClub
git
a fool, a stupid person
➖ @EngSkills ➖
Idiom of the Day
a shiver down (one's) spine
A shudder felt down one's back, due to either fear, anticipation, nervousness, or excitement.
➖ @EngSkills ➖
Funny Or Die (Youtube)
Early Comedy Inspirations Or Dream Blunt Rotation? (Inside the FOD Vault Episode 1)
Subscribe now: https://www.youtube.com/c/funnyordie?sub_confirmation=1
Get more Funny Or Die
-------------------------------
Facebook: https://www.facebook.com/funnyordie
Twitter: https://twitter.com/funnyordie
Instagram: http://instagram.com/funnyordie
TikTok: https://www.tiktok.com/@funnyordie
➖ @EngSkills ➖
Funny Or Die (Youtube)
Beth Stelling Says To Just Have Fun (Inside the FOD Vault Episode 1)
Subscribe now: https://www.youtube.com/c/funnyordie?sub_confirmation=1
Get more Funny Or Die
-------------------------------
Facebook: https://www.facebook.com/funnyordie
Twitter: https://twitter.com/funnyordie
Instagram: http://instagram.com/funnyordie
TikTok: https://www.tiktok.com/@funnyordie
➖ @EngSkills ➖
Funny Or Die (Youtube)
Hecklers are no match for Beth, especially with Sarah McLachlan. (Inside the FOD Vault Episode 1)
Subscribe now: https://www.youtube.com/c/funnyordie?sub_confirmation=1
Get more Funny Or Die
-------------------------------
Facebook: https://www.facebook.com/funnyordie
Twitter: https://twitter.com/funnyordie
Instagram: http://instagram.com/funnyordie
TikTok: https://www.tiktok.com/@funnyordie
➖ @EngSkills ➖
Word of the Day
impish
Definition: (adjective) Naughtily or annoyingly playful.
Synonyms: arch, mischievous, pixilated, prankish, puckish, wicked.
Usage: These remarks were greeted with shouts of laughter by the impish creatures and one seized the Scarecrow's arm and was astonished to find the straw man whirl around so easily.
➖ @EngSkills ➖
are aligned to form an amplified and elaborated evaluative scale of 13 levels, the E(xpanded) GIDS. Any known language, including those languages for which there are no longer speakers, can be categorized by using the resulting scale (unlike the GIDS). A language can be evaluated in terms of the EGIDS by answering five key questions regarding the identity function, vehicularity, state of intergenerational language transmission, literacy acquisition status, and a societal profile of generational language use. With only minor modification the EGIDS can also be applied to languages which are being revitalized.
Here's the EGIDS table from the Wikipedia article:
http://languagelog.ldc.upenn.edu/myl/WikipediaEGIDS.png
The current edition of Ethnologue offers a "Language Cloud" for each language, consisting of a scatter plot whose x-axis is the EGIDS level, and whose y-axis is the estimated number of speakers. Here's the Language Cloud for Bajjika:
http://languagelog.ldc.upenn.edu/myl/BajjikaLanguageCloud.png
Ethnologue's explanation of their dot colors:
* Purple = Institutional (EGIDS 0-4) — The language has been developed to the point that it is used and sustained by institutions beyond the home and community.
* Blue = Developing (EGIDS 5) — The language is in vigorous use, with literature in a standardized form being used by some though this is not yet widespread or sustainable.
* Green = Vigorous (EGIDS 6a) — The language is unstandardized and in vigorous use among all generations.
* Yellow = In trouble (EGIDS 6b-7) — Intergenerational transmission is in the process of being broken, but the child-bearing generation can still use the language so it is possible that revitalization efforts could restore transmission of the language in the home.
* Red = Dying (EGIDS 8a-9) — The only fluent users (if any) are older than child-bearing age, so it is too late to restore natural intergenerational transmission through the home; a mechanism outside the home would need to be developed.
* Black = Extinct (EGIDS 10) — The language has fallen completely out of use and no one retains a sense of ethnic identity associated with the language.
Ethnologue's Language Cloud for Duruwa uses a yellow dot:
http://languagelog.ldc.upenn.edu/myl/DuruwaLanguageCloud.png
I believe that Google's work on under-documented languages of India has been led by Partha Talukdar, whose LinkedIn page says "I lead the Languages group at Google DeepMind, India focusing on making LLMs work well for speakers of more number of languages. The goal is to make sure benefits of AI are available to a broader population where language is not a barrier anymore."
One of his group's relevant contributions is "IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages":
As large language models (LLMs) see increasing adoption across the globe, it is imperative for LLMs to be representative of the linguistic diversity of the world. India is a linguistically diverse country of 1.4 Billion people. To facilitate research on multilingual LLM evaluation, we release IndicGenBench – the largest benchmark for evaluating LLMs on user-facing generation tasks across a diverse set 29 of Indic languages covering 13 scripts and 4 language families. IndicGenBench is composed of diverse generation tasks like cross-lingual summarization, machine translation, and cross-lingual question answering. IndicGenBench extends existing benchmarks to many Indic languages through human curation providing multi-way parallel evaluation data for many under-represented Indic languages for the first time. We evaluate a wide range of proprietary and open-source LLMs including GPT-3.5, GPT-4, PaLM-2, mT5, Gemma, BLOOM and LLaMA on IndicGenBench in a variety of settings. The largest PaLM-2 models performs the best on most tasks, however, there is a significant performance gap in all languages compared to English showing that further research is needed for the development of more inclusive multilingual language[...]
➖ @EngSkills ➖
Slang of the Day | Vocabulary | EnglishClub
ditch
to end a relationship with someone
➖ @EngSkills ➖
Idiom of the Day
a shame
An unfortunate situation. The term is used either in consolation or ironically.
➖ @EngSkills ➖
Language Log
Doing well
The current Dinosaur Comics:
http://languagelog.ldc.upenn.edu/myl/DinosaurDo.png
Mouseover title: "why do other verbs when "do" does do all you did do are doing or can and will do??"
Of course do already does most of what Ryan North wants: Wiktionary gives it 31 senses, from (1) "A syntactic marker in a question whose main verb is not another auxiliary verb or be" (Do you go there often?) to (31) "To drive a vehicle at a certain speed, especially in regard to a speed limit" (He was doing 50 in a school zone). Along the way we get (29) "To take drugs" (I do cocaine), which is not far from Ryan's "do beers tonight", and for that, there's already a t-shirt:
http://languagelog.ldc.upenn.edu/myl/JustDoBeers.webp
Do's utility has been around for a while, judging by the OED's recital of Germanic cognates and further-out IE connections:
http://languagelog.ldc.upenn.edu/myl/OED_do_etymology.png
Other languages have taken a different path in choosing an everything verb, for instance starting with "make" rather than "put" (French faire / Spanish hacer), resulting in a somewhat different semantic spread.
Commenters will no doubt be able to fill us in on what other lexical seeds have similarly sprouted in other languages.
➖ @EngSkills ➖
Slang of the Day | Vocabulary | EnglishClub
zonked | zonked out (1)
under the influence of drugs or alcohol
➖ @EngSkills ➖
Idiom of the Day
bottom of the ninth
The final and critical moment or moments of a tense, important, or desperate situation. It refers to the ninth inning of baseball, the "bottom" of which is batted by the home team as their last chance to win the game. Primarily heard in US, South Africa.
➖ @EngSkills ➖
Language Log
PIE *g’enH1 and *gʷenH2 as cognates ("king" and "queen")
[This is a guest post by German Dziebel, commenting on "PIE *gene- *gwen-" (8/10/23).]
I will strike a dissenting note here. The two roots in question – *g’enH1 and *gʷenH2 – are likely cognates. There seems to be a non-random distribution of palatalized and labialized velars in IE stems with nasals – palatovelars are favored in stems with m, while labiovelars are favored in stems with n. E.g.,
nGʷ roots: *nogʷno- 'naked', *nogʷt- 'night', *snoigʷho- 'snow', *h₂ongʷo- 'anoint', *h1ngwni- 'fire', *negʷhro- 'kidney', *gʷenh₂ 'wife', *kʷoino- 'price', *penkʷe- '5', *h₁lengʷʰ- 'light', *gʷʰen- 'slay, strike', *sengʷh- 'sing', *neigʷ- 'wash'
vs.
mG'-roots: *H3moiǵhlo- (assimilated to njegull(ë) in Gheg Alb), *meǵh₂s 'great', *meh₂ǵ- 'smear, anoint', *ǵheyōm 'winter', *dheǵhōm 'earth', *ḱoimo- 'household, family', *mreǵh-, *mosgho- 'brain', *h₂melǵ- 'milk', *smeḱur 'chin, beard', *deḱm̥ '10', *h1ḱm̥tóm '100', *h₂émǵʰu- 'narrow' (Hitt hamenk- 'tie, bind').
Although there are seeming exceptions (e.g., PIE *gʷher- ‘hot’ yields -mo-derivatives in Gk θερμός, Alb zjarm, Arm jerm, in all those branches the labiovelar is found in a palatalized state), those exceptions are limited in number and can be explained as later assimilations. This is likely what happened with PIE *g’enH1 and PIE *gʷenH2 where only *gʷenH2 is “legal”, while *g’enH1 is likely assimilated from either *g’emH1 or *gʷenH1. As a supporting proof for this inference one can cite Baltic *gmti ‘beget, give birth’ (Lith gimti, Latv dzimt, OPruss gemton) that must be going back to *gʷem- (no connection to PIE *gʷen- ‘come, step’ (Lat venio:, Gk baino:, etc., with assimilation creating stems such as Germ *kwemaną (comp. *faima 'foam' < PIE *spoineh₂), PToch *kum (comp. mekwa ' nails' < *nogʷho-) and InIr *ǰámati (comp. Skrt ūrmí, Avest varəmi 'wave' but Lith vilnis, Slav *vъlna 'wave')). PIE *gʷem- went through assimilation and generalized labiality across the stem in exactly the opposite way from PIE *g’enH1 that generalized palatality. As a sum total, it’s most likely that the PIE word for ‘beget, give birth’ was * gʷen(H1)- and hence it can hardly be separated from *gʷenH2 ‘woman, wife’. Germ *kʷēniz 'wife' was likely applied to ‘queen’, too, as in Old English, and was a cognate counterpart to *kuninga- ‘king’. It’s to be expected that the words for ‘king’ and ‘queen’ were derived from a single root as they do in so many IE languages – living and dead – from Hitt hassu ‘king’, hassusara ‘queen’ onward.
➖ @EngSkills ➖
Funny Or Die (Youtube)
Comedy's Double Standard: Beth Stelling on Why Women Aren't Afforded The Opportunity to Fail
Surprise! Comedy has a double standard.
Get all 10 episodes of season 1 now, and stay in touch for new episodes, news, and show extras: https://norby.link/ctdAJD
Subscribe now: https://www.youtube.com/c/funnyordie?sub_confirmation=1
Get more Funny Or Die
-------------------------------
Facebook: https://www.facebook.com/funnyordie
Twitter: https://twitter.com/funnyordie
Instagram: http://instagram.com/funnyordie
TikTok: https://www.tiktok.com/@funnyordie
➖ @EngSkills ➖
Word of the Day
Word of the Day: initiative
This word has appeared in 1,445 articles on NYTimes.com in the past year. Can you use it in a sentence?
➖ @EngSkills ➖
Phrasal Verb of the Day | Vocabulary | EnglishClub
fall off
to become less in amount or lower in level
➖ @EngSkills ➖
Funny Or Die (Youtube)
Bad vibes all the way. (Inside the FOD Vault Episode 1)
Subscribe now: https://www.youtube.com/c/funnyordie?sub_confirmation=1
Get more Funny Or Die
-------------------------------
Facebook: https://www.facebook.com/funnyordie
Twitter: https://twitter.com/funnyordie
Instagram: http://instagram.com/funnyordie
TikTok: https://www.tiktok.com/@funnyordie
➖ @EngSkills ➖
Funny Or Die (Youtube)
This Part Of Comedy Is Beth Stelling's Worst Nightmare (Inside The FOD Vault Episode 1)
Subscribe now: https://www.youtube.com/c/funnyordie?sub_confirmation=1
Get more Funny Or Die
-------------------------------
Facebook: https://www.facebook.com/funnyordie
Twitter: https://twitter.com/funnyordie
Instagram: http://instagram.com/funnyordie
TikTok: https://www.tiktok.com/@funnyordie
➖ @EngSkills ➖
Funny Or Die (Youtube)
Crowdwork Has Overtaken Everything (Inside the FOD Vault Episode 1)
Subscribe now: https://www.youtube.com/c/funnyordie?sub_confirmation=1
Get more Funny Or Die
-------------------------------
Facebook: https://www.facebook.com/funnyordie
Twitter: https://twitter.com/funnyordie
Instagram: http://instagram.com/funnyordie
TikTok: https://www.tiktok.com/@funnyordie
➖ @EngSkills ➖
Funny Or Die (Youtube)
Spoiler alert: Comedians don’t serve up a new routine nightly (Inside the FOD Vault Episode 1)
Subscribe now: https://www.youtube.com/c/funnyordie?sub_confirmation=1
Get more Funny Or Die
-------------------------------
Facebook: https://www.facebook.com/funnyordie
Twitter: https://twitter.com/funnyordie
Instagram: http://instagram.com/funnyordie
TikTok: https://www.tiktok.com/@funnyordie
➖ @EngSkills ➖
Language Log
"Lost" languages?
The use of the word lost in this recent story caught my attention — Pankaj Doval, "Google set to revive lost Indian languages", The Times of India 10/3/2024:
As it gets deeper into India with generative AI platform Gemini and other suite of digital offerings, Google has taken up a new task in hand – reviving some of the lost Indian languages and creating digital records and online footprint for them.
I'll say more later about Google's important and interesting contribution to an important and interesting problem. But first, what does the article mean by "lost Indian languages"? I started from the idea that languages that are "lost" are extinct, i.e. no longer spoken — and a web search for the phrase "lost languages" confirms that others have the same interpretation.
However, the Times of India article makes it clear that this is not what they mean:
The idea is to enable people to easily carry out voice or text searches in their local dialects and languages.
As the work moves towards completion, people in the hinterland and various regions can easily do voice search in their own languages to gain accurate and valuable information from, say, Google's Gemini AI platform or carry out live translations, harness YouTube better to target their communities.
The project has so far reached 59 Indian languages, including 15 that currently do not have any kind of a digital footprint and were rather declining in usage.
And the article includes this graphic, listing 8 of those 59 languages:
http://languagelog.ldc.upenn.edu/myl/GoogleIndianLanguages.webp
Looking these languages up on Ethnologue and Wikipedia tells us that some of them have as many as 10 to 20 million speakers (details below), so they're far from extinct. And it's misleading to say that they "have been recorded digitally by Google for first time": for example, the Wikipedia article for Bajjika says that "Lakshmi Elthin Hammar Angna (2009) was the first formal feature film in Bajjika. Sajan Aiha Doli le ke came after that". And YouTube has quite a few items partly or entirely in Bajjika, including a Bajjika Channel.
Of course, some of the cited languages are smaller: Wikipedia says that Duruwa has 18,151 speakers, while Ethnologue gives the number as 12,000. And the motivation for the Google project is that all of these languages are (or were) "under documented" or "under resourced", in the sense that they lack the digital resources needed for robust modern language technologies such as speech-to-text, text-to-speech, text understanding, and so on. And there's a general concern that this situation makes a language potentially "endangered" and thus at risk of being lost.
It's possible that Indian English generally uses "lost language" in this sense, though I'm guessing that the author of the article (or someone else in the editorial chain) made the choice.
Anyhow, it's worth spending a few minutes on a widely-used attempt to clarify the relevant terminology: the Expanded Graded Intergenerational Disruption Scale (EGIDS), originally proposed in M. Paul Lewis and Gary Simons, "Assessing endangerment: expanding Fishman’s GIDS" (2010), http://ipv6.lingv.ro/RRL%202%202010%20art01Lewis.pdf:
ABSTRACT: Fishman’s 8-level Graded Intergenerational Disruption Scale (GIDS) has served as the seminal and best-known evaluative framework of language endangerment for nearly two decades. It has provided the theoretical underpinnings for most practitioners of language revitalization. More recently, UNESCO has developed a 6-level scale of endangerment. Ethnologue uses yet another set of five categories to characterize language vitality. In this paper, these three evaluative systems [...]
➖ @EngSkills ➖
Phrasal Verb of the Day | Vocabulary | EnglishClub
round down
If you round a number or an amount down to a certain level such as a whole number or the nearest dollar, you bring it down to that level.
➖ @EngSkills ➖