Talk:SAMPA
Note:
SAMPA is now deprecated for use in Wikipedia. Please use the IPA when possible. To make the IPA viewable on Internet Explorer, enclose it with {{IPA|...}}
Old discussion
I am not 100% sure, but it appears as though many of the English examples, especially the vowels, are more typical of British English than of American English. Could we have some American English examples?
Does [ Q ] occur in US English? I'm guessing the following transcriptions for US:
- law : [ lA: ]
- dot : [ dA:t ]
- cat: I'm desperately trying out other British English accents which pronounce "cat" differently and drawing a blank. tending towards [ c(I){t ] maybe? would anyone care to enlighten me?
- It can be cardinal [a] in the north of England. There's some social variability in educated southern speech; it may tend to [E] for some. Gritchka
- fun: no idea... [ f3:n ] ?
-- Tarquin, Tuesday, June 18, 2002
- "Law" will vary depending on whether you're transcribing pronunciation from an area where the local dialect merges "cot" and "caught" or not. Here where the merger is common (generally speaking, west of the Mississippi), [lA:] sounds right. I'm not so sure about [dA:t] vs [dAt], but the length doesn't really matter here; it's certainly not [Q] unless I'm sorely misunderstanding what [Q] is. "Fun" is only [fVn] as far as I know, and I'm even more confused by the "London English" "cat" issue.
- Also, you may or may not be interested in the SAMPA-IPA table I've put in the Esperanto version of this article, based on the IPA table I snagged from this German page. --Brion VIBBER
The table would be good as an addition to the current one. Most of the IPA symbols render as "?" on Win98 :(
- (That is my one reservation about it; IPA characters aren't in the most common fonts. If you have Arial Unicode or Lucida Sans Unicode installed, they should work, but these are only common with more recent versions of MS Windows and/or Office. --Brion VIBBER)
About Q and O... (Mustn't get muddled translating from IPA to SAMPA...) My Collins says cot [ kQt ] (upside down round a) and caught [ kO:t ] (upside down c). I'm trying to hear in my head how characters on US TV shows speak... I'm an RP speaker, so [ O ] occurs in "bored", "law", "order". [ Q ] is in "poppy" and words ending in "ology", [ Ql@dZI ]. Unfortunately, I think US pronunciation will be different for all. I'll listen out next time I watch a US tv show and see if I can spot a [ Q ]. For that matter, Dr Corday on ER has an RP accent. Tarquin, Tuesday, June 18, 2002
- The only dictionaries I have that give IPA for English pronunciations are bilingual dictionaries, which all seem to give RP. American English dictionaries usually use idiosyncratic symbol sets and don't bother giving IPA keys, but simply give words as examples. This works well enough if you're trying to play the least common factor for a lot of dialects and provide clues for a few previously unknown words, but is very frustrating to someone who wants an actual reference on how this or that phoneme gets pronounced -- if you don't already know, you're not going to get anything out of it!
- As accents keep changing it's hard to give a reference set even for an abstraction like Standard British or General American. The o-merger in American, and the shift of S. Br. 'cut' from [kVt] to [kat] are big changes. It's one reason old-fashioned RP has been used as reference for so long. Gritchka
- I'm trying to think of some potential Q/A minimal pairs... Hmm: bomb/balm, wok/walk, tock/talk, hock/hawk. (After checking a dictionary, I guess the latter three won't work for you as they've got [O:] in RP.) Further, it's long [A:] (for "balm") but short [Q] according to those RP dics; I'm not sure I can reliably produce or detect a distinction in either quality or quantity, but neither am I 100% convinced that there's not a subtle difference in my pronunciation that vanishes when I deliberately look for it. Rather annoying! --Brion VIBBER
I am an American, from Connecticut, and I believe that in my area, the sound of "o" in words such as "stop", "fox", etc., is just about the same as the vowel "a" as it is pronounced in Spanish, etc. Here, there is no difference at all between the sound that primary school teachers call "short o" and the sound you make when a doctor tells you to "say ah". -User:Juuitchan
While I'm at it, vowel length around here (Connecticut) seems to be pretty much irrelevant. I am not sure, but I think in the local speech, there are ten vowel sounds, not counting diphthongs or "r-colorings", in addition to "vocalic r". I will enumerate:
- "a" of "wasp" or "o" of "fox"
- "a" of "cat"
- "i" of "dish"
- "ee" of "feet"
- "u" of "glue" or of "rude"
- "u" of "but"
- "oo" of "foot" or "ou" of "should"
- "a" of "ball" or "aw" of "law" or of "caw"
- "e" of "let" or "ea" of "head"
- "o" of "old" or of "snow"
- Vocalic r: "ir" of "bird" or "er" of "her"
Some things I have noticed among students around here learning Japanese:
- 1. I myself have trouble perceiving vowel length.
- 2. Around here, the pure "o" vowel occurs only in diphthongs, except when followed by a liquid. This caused some students to mispronounce the Japanese "o".
hmmm... that maps pretty well to what we have in MN:
- /i/ - "seat"
- /I/ - "sit"
- /ej/ - "sate", with predictable allophone [e] before /r/ e.g. "share"
- /E/ - "set"
- /&/ - "sat"
- /a/ - "sod"
- /A/ - "salt"
- /ow/ - "local", "sowed", again with allophone [o] before /r/ e.g. "soar"
- /u/ - "Luke", "sued"
- /U/ - "look", "soot"
Though there is some question in my mind as to the independence of /A/ from /a/; the names "Dawn" and "Don", while nominally /dAn/ and /dan/, are considered homophonous. It seems to me that /A/ primarily occurs where orthographic <a> precedes <l>+consonant or <w>, which begs the question of whether we have a case of /alC/ becoming first [AlC] and then deleting the /l/ to become /AC/ (with /aw/ undergoing a similar process), i.e. an allophonic shift rather than a true phonemic distinction (though it obviously has possibilities of becoming fully phonemic). Pgdudda
- Hmm, what do you have for "sud"? --Brion VIBBER
- Me, I have /s^d/ for "sud"; I know in New Zealand it's /s^:d/ (the length is quite noticeable to my American ears - it's a pretty bizarre vowel). BTW, I'm looking forward to hearing the sound files mentioned below. :-) Pgdudda
"bomb", "balm" and "walk" have 3 different vowel sounds for me, [Q], [A:], [O]. "balm" rhymes with "charm", which I imagine it doesn't for you. I've just signed up to wikipedia-L so the easiest thing might be to post a soundfile there - does it take attachments? -- Tarquin, Wednesday, June 19, 2002
- Aw heck, just upload them and include them in the article as samples! :) --Brion VIBBER
- Which sound format would be best? RA, MP3 (proprietary), WAV (fat!)... OGG is ideologically the best choice, but I don't know how common players for it are yet. Tarquin
- Definitely not the Satan-spawned RealAudio! Very short WAVs (single words at low sample rates) should be okay, but a compressed format is nicer. We currently have a few MP3s up for Pater Noster samples of a few languages and haven't been sued yet... If you'd prefer OGG, feel free. There are players or plugins available for most major platforms, just include a note with a link to Vorbis; people should be able to get to links to players from there. --Brion VIBBER
- I'm imagining those three words in RP... *giggle* they sure sound different! For me, "bomb" is /bam/, "balm" is /bAlm/ or /bAm/ (notice optional deletion of /l/), and "walk" is /wAk/ (not to be confused with "wok" /wak/). It appears that I only have two vowels where you have four, and that those two are in the process of merging into one. In these three cases, [Q] & [a] map to /a/, and [A(:)] & [O] map to /A/, which both may ultimately map to /a/. For me, "charm" /tSarm/ is a half-rhyme for "bomb" (half-rhyme since my dialect preserves that postvocalic [r]), though it could be used as a weak rhyme for "balm". To quote Spock, "Fascinating"! Pgdudda
- For me, "bomb" and "balm" are homophones, I think. (I am not sure, since "balm" is a very rare word.) "Walk" definitely has a different vowel sound. I think "walk" and "lawn" have the same vowel sound. --User:Juuitchan
I think we are trying to distinguish two vowels here. One is the vowel of the scream "Aaahhhhh!". The other is "aw" as in "Awwww, poor baby".
- yes, I make those out to be [A] and [O]. I've recorded those too, but since the uploading isn't quite working I'll leave those till later. I put [Q] first since it seems it doesn't occur in US English -- Tarquin
Sound samples
I've uploaded an OGG file of [Q] examples: media:SAMPA_Q_dot_bomb_wasp_English_RP.ogg. Oops. I hadn't realised it made a page for the media file. Before I upload [A] and [O], does anyone have suggestions for simpler page names? -- Tarquin 05:10 Jul 25, 2002 (PDT)
- Special characters can be a pain, but I love long, descriptive names; less chance of collision. And the sound works well, BTW. --LDC
Any other comments on page names for phoneme samples before I upload some more? -- Tarquin 03:07 Jul 31, 2002 (PDT)
So, what about descriptions based on phonetic criteria and not English dialects? Perhaps 'SAMPA_Q_dot_bomb_wasp_English_RP' is ok to you, but it does not look very informative to me. Maybe a redirect to 'SAMPA_Q_vowel_unrounded_back_low' or some other scheme? -- User:Perique des Palottes.
- That filename is descriptive because it tells you what it contains! The words "dot", "bomb", and "wasp" in English (RP), which are used as examples of words containing SAMPA [Q]. "vowel unrounded back low" are all redundant to the Q, and don't distinguish this file from another sample that might use different words, or words from another language, to demonstrate that same sound. --Brion VIBBER
- Ok, I'm convinced. I'll upload the other two I've made & record some more. Tarquin
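The naming scheme settled on above (symbol, then the example words, then language and variety) can be sketched in a few lines of Python. This is only an illustration: the function name `sample_filename` and its parameters are hypothetical, not anything Wikipedia provides.

```python
def sample_filename(symbol, words, language, variety, ext="ogg"):
    """Build a descriptive media name from its parts.

    Hypothetical helper: joins the SAMPA symbol, the example words,
    the language and the variety with underscores.
    """
    parts = ["SAMPA", symbol, *words, language, variety]
    return "_".join(parts) + "." + ext


# Reproduces the name of the file uploaded for [Q].
print(sample_filename("Q", ["dot", "bomb", "wasp"], "English", "RP"))
```

The sketch does nothing about SAMPA symbols that are awkward in page names, such as { or @; those would need some agreed spelling first.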
Page per phoneme?
The table is starting to look dense. We might also want sound samples in other languages. And there are the linguistic descriptions of the sounds. Maybe we should consider adding no more to the table, and instead making a page for each phoneme. Question is, what to call them? -- Tarquin 09:30 Jul 31, 2002 (PDT)
What about returning to a summary introduction, similar to the original we had a few days ago, with a link to a more detailed chart, something like a completed version of SAMPA/temp?
Perique des Palottes.
Temporary SAMPA/temp contains:
- Long chart of consonants and vowels
- Concise chart of consonants and vowels for English
Opinions?
But I've noticed there are some holes, the most noticeable one being: how come SAMPA does not have a proper sign for British English 'r' (labialised and retroflex)?
2002/08/05 Perique des Palottes.
This was part of the former text:
Vowels
SAMPA | IPA | English (RP) example | French example | German example |
[ { ] | æ - ae ligature -- low front unrounded vowel | London English cat | | |
[ E ] | epsilon -- middle open front unrounded vowel | bet | même | Herr |
[ e ] | e -- middle closed front unrounded vowel | US bear | année | mehr, hin und her |
[ i ] | i -- closed front unrounded vowel | meet | vite | mieten |
[ Q ] | turned script a -- low back rounded vowel | British English dot (listen) | gavotte | toll (wonderful) |
[ O ] | turned c -- middle open back rounded vowel | British English law | | |
[ o ] | o -- middle closed back rounded vowel | U.S. sore | beau | Sohle |
[ u ] | u -- closed back rounded vowel | boot | goût | Hut |
[ & ] | Œ - s.c. OE lig -- low front rounded vowel | | | |
[ 9 ] | œ - oe ligature -- middle open front rounded vowel | | neuf, hence the symbol 9 | Hölle |
[ 2 ] | ø -- middle closed front rounded vowel | | deux, hence the symbol 2 | Höhle |
[ y ] | y -- closed front rounded vowel | | du | Tür |
[ A ] | script a (ie circular, Arial-style) -- low back unrounded vowel | spark, arm, US law | ame, vase, tas | |
[ a ] | a (Times Roman-style) -- low central unrounded vowel | | bâteau, lac, il plongea | Haar |
[ @ ] | turned e -- middle central unrounded vowel | about, winner (schwa) | | machen (informal speech) |
[ 3 ] | reversed epsilon | English fir, nurse | | |
[ I ] | small capital I | English city | | mit |
[ U ] | approx turned Ω or υ | English book | | Mutter |
[ } ] | | Swedish: sju | | |
[ V ] | ^ - turned v | British English fun | | Alter |
SAMPA | IPA | Vietnamese example | Korean example |
 | ɯ (upside-down m) | ư | 으 |
 | ɤ (greek gamma squeezed btwn midline and baseline) | ơ | 어 |
Vowel modifiers:
- [ ~ ] after a vowel indicates that it is nasalised (e.g. French bon [bO~] ).
- [ : ] after a vowel indicates that it is lengthened (e.g. Japanese shōshō [So:So:], English see [si:] ).
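As a rough illustration of how the vowel table and the two modifiers combine, here is a minimal Python sketch that maps single SAMPA characters to IPA. The names `VOWELS`, `MODIFIERS` and `sampa_vowels_to_ipa` are made up for this example, the mapping covers only rows of the table above that have an IPA entry, and any character not in the mapping (consonants included) is passed through unchanged.

```python
# SAMPA vowel symbols from the table above, paired with their IPA letters.
VOWELS = {
    "{": "\u00e6",  # ae ligature
    "E": "\u025b",  # epsilon
    "Q": "\u0252",  # turned script a
    "O": "\u0254",  # turned c
    "&": "\u0276",  # small capital OE ligature
    "9": "\u0153",  # oe ligature
    "2": "\u00f8",  # o with stroke
    "A": "\u0251",  # script a
    "@": "\u0259",  # turned e (schwa)
    "3": "\u025c",  # reversed epsilon
    "I": "\u026a",  # small capital I
    "U": "\u028a",  # upsilon
    "V": "\u028c",  # turned v
}

# The two vowel modifiers: nasalisation and length.
MODIFIERS = {
    "~": "\u0303",  # combining tilde, attaches to the preceding vowel
    ":": "\u02d0",  # IPA length mark
}


def sampa_vowels_to_ipa(text):
    """Replace each known SAMPA character; leave everything else as it is."""
    table = {**VOWELS, **MODIFIERS}
    return "".join(table.get(ch, ch) for ch in text)


print(sampa_vowels_to_ipa("bO~"))  # French bon
print(sampa_vowels_to_ipa("si:"))  # English see
```

Because it works one character at a time, this sketch is only safe for symbols that are a single character long; digraphs need a tokenising step first.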
Diphthongs
Sampa | IPA | Examples |
[ eI ] | e; small cap I | English day |
[ aI ] | | English my |
[ OI ] | turned c; small cap I | English boy |
[ @U ] | schwa; upsilon | English no |
[ aU ] | | English how |
[ I@ ] | | English near |
[ e@ ] | | English hair |
[ U@ ] | | English poor |
[ aI@ ] | | English fire |
[ aU@ ] | | English sour |
Consonants
b, d, f, h, k, l, m, n, p, r, t, v, w have their usual English values. The following list of other consonants is not complete.
Sampa | IPA | Examples |
[ g ] | g | English get |
[ j ] | j | English yes |
[ x ] | x | English loch |
[ D ] | ð - eth | English this or that |
[ T ] | θ - theta | English thick |
[ s ] | s | English see, pass |
[ z ] | z | English zoo, rose |
[ S ] | esh (approx. ∫) | English sheep, French chemin |
[ Z ] | ezh (yogh) (approx 3 or script z) | French jour, English pleasure |
[ N ] | right-tail n | English thing |
[ J ] | left-tail n | palatal nasal, Spanish ñ, Italian or French gn, Catalan or Hungarian ny |
[ L ] | reversed lambda | palatal lateral, Italian gli, Catalan or Spanish (some dialects) ll |
[ R ] | small capital R | uvular trill, French or standard German r |
Affricates
More to be added later
[ dZ ] | | English joy |
[ tS ] | | English chair |
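Since the diphthongs and affricates above are written with two or three characters, a transcription has to be split into symbols before any lookup. One common approach, sketched here in Python as an assumption rather than anything SAMPA itself prescribes, is greedy longest-match: at each position try the longest multi-character symbol first, then fall back to a single character. The names `MULTI` and `tokenize` are hypothetical.

```python
# Multi-character symbols taken from the diphthong and affricate tables above.
MULTI = ["aI@", "aU@", "eI", "aI", "OI", "@U", "aU", "I@", "e@", "U@", "dZ", "tS"]


def tokenize(sampa):
    """Split a SAMPA string into symbols, preferring the longest match."""
    symbols = sorted(MULTI, key=len, reverse=True)  # three-character symbols first
    tokens = []
    i = 0
    while i < len(sampa):
        for sym in symbols:
            if sampa.startswith(sym, i):
                tokens.append(sym)
                i += len(sym)
                break
        else:
            # No multi-character symbol starts here: take one character.
            tokens.append(sampa[i])
            i += 1
    return tokens


print(tokenize("dZOI"))  # joy
print(tokenize("tSe@"))  # chair
print(tokenize("faI@"))  # fire
```

Greedy matching can mis-segment a sequence such as t followed by S across a syllable boundary, so this should be read as a starting point only.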
SAMPA/To Do: pages with pronunciations that need to be converted to SAMPA. (Note: the use of SAMPA on Wikipedia has not been decided yet; please wait before converting these pages -- see Wikipedia talk:Manual of Style (pronunciation).)
I find it odd that this article instructs that the pronunciation for "SAMPA" is /s{mpA:/. In my dialect (Australian English) it's not (usually) possible for a word to end in /A:/ unless it's spelled with a final "r" - and in this case the word would have a "connecting r". I can only think of a few single syllable exceptions. The natural way to pronounce this word for me is /"s{mp@/. Is the /s{mpA:/ pronunciation prescribed somewhere or is this just an observation in one area? Hippietrail 12:30, 28 Mar 2004 (UTC)
- I'm a British RP speaker and I'd say the same. -- Tarquin 12:54, 29 Mar 2004 (UTC)
SAMPA a hack?
Thanks, Nohat, for your recent fix. It's a compromise that's acceptable to me (the one who reverted Chameleon's edit the first time around). One thing that particularly irked me was the [[Mozilla|recent browsers]] which is now gone. -- pne 10:45, 12 Jul 2004 (UTC)
- Not "recent", but "modern". IE is still lacking in terms of respecting modern web standards. The only thing bad about Nohat's edit is that it doesn't point out that IE is the only browser that doesn't support IPA. — Chameleon My page/My talk 11:13, 12 Jul 2004 (UTC)
- OK, "modern", then. Nevertheless, while it was incorrect "back then" for sites to proclaim that "we support both browsers", that's still true today: there are browsers besides MSIE and Netscape/Mozilla/Firefox.
- Even if you mean "IE is the only current browser that doesn't support IPA", it seems wrong to me. How would you know?
- And anyway, I disagree that an encyclopædic article about SAMPA is the place to discuss how specific versions of specific browsers support specific features. -- pne 12:33, 12 Jul 2004 (UTC)
- Not really. There are broadly two types of browsers these days: IE, which is not standards compliant, and the others, which are. A third category might include stripped-down browsers with limited features. For example, Links can't show images, and a lot of characters are screwed up, but nobody uses Links expecting to see the page as the webmaster intended.
- It is entirely within the scope of an article that touches on the subject of character encoding (International Phonetic Alphabet, SAMPA, Unicode, etc.) to mention which software supports such features. Unless you're being paid by Microsoft, I don't see the point of trying to conceal this information. — Chameleon My page/My talk 13:42, 12 Jul 2004 (UTC)
- It's not true that IE doesn't "support" IPA. The problem is that if the default font doesn't contain Unicode characters on the page, it doesn't substitute those characters from a different font. If you specify a font that contains Unicode IPA characters, then IE will display them just fine. It's nothing particular to IPA and especially considering that neither SAMPA nor IPA were devised specifically for use on web pages, it isn't really relevant to this page. There is lots of software that doesn't support Unicode very well. What is the point of singling out IE? Nohat 17:58, 12 Jul 2004 (UTC)
- When a large group of users can't read the information in their default browser (i.e. Win98/MSIE users), it's certainly appropriate to let them know what they can do to read it.
- Perhaps we need a page at Help:IPA that explains concisely how to set up specific browsers to view IPA. We could use a standard template that links to the page with text like "Pronunciation on this page is displayed in IPA. Problems? See Help:IPA."
- —Michael Z. 16:34, 2004 Oct 2 (UTC)
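To make the font point in this thread concrete: the characters at risk are the ones outside the repertoire of a basic default font. As a rough sketch, and on the assumption that the Unicode blocks "IPA Extensions" (U+0250 to U+02AF) and "Spacing Modifier Letters" (U+02B0 to U+02FF) are a fair proxy for "characters a default font may lack", the following Python lists which characters in a string fall in those blocks. The names `IPA_BLOCKS` and `ipa_block_chars` are invented for the example.

```python
import unicodedata

# Assumed proxy for "characters a basic font may lack".
IPA_BLOCKS = [
    (0x0250, 0x02AF),  # IPA Extensions
    (0x02B0, 0x02FF),  # Spacing Modifier Letters
]


def ipa_block_chars(text):
    """Return the distinct characters of text that lie in IPA_BLOCKS, in order."""
    found = []
    for ch in text:
        cp = ord(ch)
        if any(lo <= cp <= hi for lo, hi in IPA_BLOCKS) and ch not in found:
            found.append(ch)
    return found


# An arbitrary sample transcription, written with escapes so it is unambiguous.
sample = "\u00f0i\u02d0z w\u0259\u02dedz"
for ch in ipa_block_chars(sample):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

This says nothing about which fonts a given reader actually has installed; it only flags the characters worth checking.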
SAMPA is annoying
What use is this SAMPA? We have IPA already, it is authoritative and widely known. I find these references to SAMPA littered all around Wikipedia, cluttering it up and making it less legible. It seems all the people who know SAMPA are just a subset of those who know IPA anyway, would it not be better for everyone's sanity to just use one transcription? I see no benefit whatsoever in using SAMPA for anything.
- SAMPA is easier to enter for many editors, and as you're aware if you read this talk page, not all users can see IPA characters displayed in their browser. The benefits of getting the information across to readers of Wikipedia are self-evident, no?
- I agree that SAMPA adds clutter and wish we could dispense with it. Once WP switches to UTF-8 and as Win98/MSIE becomes less common, I hope that exclusive IPA will become more acceptable. But keep in mind that many users use computers where they have no control over fonts and settings, and it will be a long time until IPA is universally viewable. —Michael Z. 16:11, 2004 Oct 2 (UTC)
What if articles had a standard "pronunciation guide" box floating at the top-right? All the IPA and SAMPA could be displayed consistently, out of the way, with a standard link to IPA and SAMPA guides. Transliteration schemes for different languages could be linked here too, from pages where they are used. —Michael Z. 16:31, 2004 Oct 2 (UTC)
I have to agree with this user's sentiment. This article (and others I've seen) states that IPA is preferred over SAMPA on Wikipedia, yet there's a ton of SAMPA scattered around the 'pedia (sometimes without even a reference to SAMPA, so >90% of users have got to be totally baffled). I'm on a college campus where computers are overwhelmingly Macs, and Firefox/Safari use is near universal. How many users will be unable to see ðiːz wə˞dz? Is there any data? -leigh (φθόγγος) 23:44, Dec 14, 2004 (UTC)
- Just to address the point about some browsers not being able to view IPA symbols correctly, I think this mainly (or perhaps only) refers to MSIE, and can in any case be overcome by enclosing the IPA symbols within the IPA template (as I have been doing on a number of articles) so that [ðiːz wə˞dz] become legible on MSIE as well. rossb 23:32, 17 Feb 2005 (UTC)
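For anyone doing this wrapping in bulk, the step is just string concatenation around the {{IPA|...}} syntax quoted in the note at the top of this page. A minimal Python sketch follows; the function name `wrap_ipa` is hypothetical, and the check for an existing wrapper is a simple prefix and suffix test, not a wikitext parser.

```python
def wrap_ipa(transcription):
    """Enclose a transcription in the IPA template unless it already is."""
    if transcription.startswith("{{IPA|") and transcription.endswith("}}"):
        return transcription
    return "{{IPA|" + transcription + "}}"


print(wrap_ipa("[si\u02d0]"))
```

Calling it twice on the same text is harmless, since an already wrapped transcription is returned unchanged.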
SAMPA Chart
- Today, officially, SAMPA has been developed for all the sounds of the following languages:
In the article it says SAMPA has been developed for those languages, and there are links to the SAMPA chart for English and the SAMPA chart for European languages. Would there be any chart for the other languages listed in the article? -- 16:50, January 26, 2005, UTC