Saturday, July 30, 2011

Language-Learning Software; Bottom-Up?

As I've said, this year I've been working some on modern Greek, and thinking about it. My context, though, is a little more than simply that of a programmer with Greek family but no aptitude for languages. For more than ten years now, I've been on-and-off involved with media-annotation software that was mostly intended for teaching languages in a higher-education setting. (See, e.g., a 2005 Semantic Web Applications writeup, "Semantic Annotations for Digital Video" (PDF).) Quite a few Colgate students have learned Russian using Russian video with time-aligned transcript and commentary, texts that I can't read but that were linked by code I wrote. (The commentary may or may not be just one "layer", and may or may not include a translation. The time-aligned transcript starts as the sort of thing you create in Transcriber, whose previous version I had scripts for, or with ELAN (Language Archiving Technology). I've contributed code to ELAN, I like its design, and I'm doing more work on it this year, but I end up wanting an actual editable HTML page to contain the transcript etc. I'm hoping to do it all in HTML5, Real Soon Now.)

Of course, I'm not trying to learn modern Greek in a higher-education setting; I'm doing that on my own. Self-study software? I did get the Rosetta Stone Greek Level 1-2 Set and used it for a while (the little doggie at my feet is in fact named Rosetta, and that's why). I haven't used it for a long time, though, and I really can't do it while trying for nine-minute miles on the elliptical. Still, the idea of language learning as a figure-it-out-as-you-go-along matching game does appeal; it's a common way for game worlds to work, and I have thought about implementing it within a world of time-aligned transcript+commentary, where we match sentences and play the selection... Hmm. But that would be further from my own personal current requirements than Rosetta itself. Well, since I was using actual flashcards for exercise-study and then using texts that I could think of as flashcards, it obviously might make sense to use flashcard software, say The Mnemosyne Project, which offers:

  • Efficient scheduling algorithm, so you don't waste time on things you know well
  • Support for languages using different scripts through Unicode
  • Support for pictures, sounds and HTML formatting
  • Support for three-sided cards, e.g. foreign words where you are interested in written form, pronunciation and translation

Mostly, though, I've been using pages that look like this fragment, from Dover's Listen & Learn Modern Greek:

I treat these as three-sided cards of variable size; in other words, I try to memorize a word at a time, then a phrase at a time, then a clause at a time, then a sentence at a time, and maybe even a dialog-fragment at a time. I start with the English and phonetic rendering, over and over, one word at a time, and then when I come back to a piece later I try to pay attention only to the Greek text. Repeat, repeat, repeat. Hmm....

My ideal software, maybe, would have time-aligned video dialog scenes (pretend or even real restaurant scene, airport scene, hotel scene...) done at full speed by native speakers and time-aligned sentence by sentence, but that wouldn't be the starting point. That would be the goal. The starting point for each scene would use the same transcripts spoken much more slowly with alignment points between each word and the next, and markup to indicate phrase/clause/sentence/paragraph structure.
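To make that starting point a bit more concrete, here is a minimal sketch of how the slow, word-aligned version of a scene might be represented. Everything in it is an assumption for illustration: the `Segment` class, its field names, and the timings are invented, and this is not ELAN's or Transcriber's actual file format. The only idea it shows is that words carry the alignment points, and larger units (phrase, clause, sentence) get their playback range from the words they contain.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Segment:
    """A word, phrase, clause, or sentence in a slowly spoken transcript."""
    kind: str                      # "word", "phrase", "clause", "sentence"
    text: str
    start_ms: int = 0              # alignment points; only set on words
    end_ms: int = 0
    parts: list[Segment] = field(default_factory=list)

    def bounds(self) -> tuple[int, int]:
        """Playback range: a word's own times, or the span of its parts."""
        if not self.parts:
            return (self.start_ms, self.end_ms)
        ranges = [p.bounds() for p in self.parts]
        return (min(s for s, _ in ranges), max(e for _, e in ranges))


# Invented timings, for illustration only.
phrase = Segment("phrase", "κράτα τα ρέστα", parts=[
    Segment("word", "κράτα", 0, 700),
    Segment("word", "τα", 700, 1000),
    Segment("word", "ρέστα", 1000, 1800),
])
print(phrase.bounds())  # (0, 1800)
```

With something like this, "play the word" and "play the phrase" are the same operation on different nodes, which is what a word-then-phrase-then-clause approach would need.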

The user would start with word-at-a-time audio to play over and over while getting used to the English; in effect, single-word flashcards with audio support. As those got familiar, they'd be merged into phrases and so on up; a Mnemosyne-like strategy would have to track which phrases contain only words you've adequately learned, which clauses contain only phrases you've adequately learned, and so on. There would also be auxiliary pseudo-scenes for verb conjugations and such, but they'd be dealt with in the same way: bottom-up.
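The tracking part can be sketched roughly as follows. This is a guess at the bookkeeping, not how Mnemosyne works: its real scheduler does spaced repetition, and here a simple count of consecutive correct answers (`correct_streak`, with an arbitrary threshold of 3) stands in for "adequately learned". The names `Card` and `due_cards` are hypothetical.

```python
from __future__ import annotations
from dataclasses import dataclass, field

LEARNED_AFTER = 3  # arbitrary stand-in for a real scheduler's judgment


@dataclass
class Card:
    """A flashcard for a word, or for a larger unit built from smaller cards."""
    text: str
    parts: list[Card] = field(default_factory=list)
    correct_streak: int = 0

    def is_learned(self) -> bool:
        return self.correct_streak >= LEARNED_AFTER

    def is_unlocked(self) -> bool:
        # Words have no parts, so they are always available.
        # A phrase (clause, sentence...) needs every part learned first.
        return all(p.is_learned() for p in self.parts)


def due_cards(root: Card) -> list[Card]:
    """Cards that are unlocked but not yet learned, parts before wholes."""
    found: list[Card] = []

    def visit(card: Card) -> None:
        for part in card.parts:
            visit(part)
        if card.is_unlocked() and not card.is_learned():
            found.append(card)

    visit(root)
    return found


words = [Card("κράτα"), Card("τα"), Card("ρέστα")]
phrase = Card("κράτα τα ρέστα", parts=words)

print([c.text for c in due_cards(phrase)])  # only the three words
for w in words:
    w.correct_streak = LEARNED_AFTER        # pretend the words are now known
print([c.text for c in due_cards(phrase)])  # now only the phrase
```

The pseudo-scenes for verb conjugations would fit the same shape: each conjugated form is a leaf card, and the table as a whole only comes up once the forms do.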

Sure, it would be nice to click "KRAH-tee-seh" or the Greek it represents and get a dictionary entry which identified as much as possible. But that kind of thing is easy enough; it so happens that I'm spending part of my work-time trying to improve the extent to which ELAN can talk to SIL's FLEx, which does dictionaries nicely. (As long as somebody creates the dictionary data; content is hard.) And it might be nice to add a link to a Google Translate suggestion, in this case "κράτα τα ρέστα". Many things might be nice, but the idea I'm thinking about has what I'd like to call bottom-up flashcards, based on an ELAN-style breakdown of each sentence as a basic framework, with the actual dictionary for the bottom level if possible. Hmm.

I would really like a piece of software that supports this kind of use-case, and maybe one exists. Or it's possible that the code I work with now can be modified to do so.

Of course, maybe I shouldn't be thinking about this as a strategy; it may be a bad strategy even if implemented well. If I were a foreign-language teacher, or even a good foreign-language student, I'd probably know better.

Or then again, maybe not. I dunno.
