treat languages like protocols?

Ranter

Condor

31548

Comments

2

stop

6580

5y

That is the current understanding, but language is complex. The first Duden (the most widely used dictionary in the DACH region) had 27,000 words; the 28th edition has 148,000, with many words newly added or removed along the way.
5

Fast-Nop

36583

5y

Wouldn't really work because language is a lot more than only structure (grammar) and tokens (vocabulary). It's also how the tokens are interrelated with each other, invoking side aspects of meaning.

You can express the same underlying topic in different words e.g. for framing events in different narratives, and even if the facts remain the same, the meaning can change drastically. That's a key principle in subtle propaganda.

Language isn't like a taxonomy that you can define. It's a self-referential system, basically a huge circular loop. That's no accident, because it's how the mind works.

It's also why taxonomic object hierarchies haven't been working out as well as people thought when they were high on multi-level inheritance crack in the 90s - only to discover that you will always run into catdogs sooner or later.
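
To make the catdog problem concrete, here is a minimal Python sketch. The class names are made up for illustration; the point is only that an object fitting two branches of a taxonomy forces the hierarchy to pick a winner arbitrarily.

```python
class Animal:
    def sound(self):
        return "..."


class Cat(Animal):
    def sound(self):
        return "meow"


class Dog(Animal):
    def sound(self):
        return "woof"


# A CatDog belongs to both branches of the taxonomy. Python settles the
# conflict through its method resolution order (MRO): the base class that
# is listed first wins, which is a convention and not a fact about catdogs.
class CatDog(Cat, Dog):
    pass


class DogCat(Dog, Cat):
    pass


print(CatDog().sound())  # meow
print(DogCat().sound())  # woof
```

The same animal gives a different answer depending on the order in which its parents happen to be written down, which is roughly the kind of ambiguity a fixed taxonomy of word meanings would run into.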
4

deadlyRants

5519

5y

The whole problem behind this idea is that natural language is ambiguous and constantly changing, dialects included.

Try writing a C compiler that handles 3 different meanings for the 'return' keyword based on some other context, can change its behaviour if a comment contains sarcasm and can adapt to different syntax variants according to the programmer's place of birth.

Some researchers even stopped any attempt at transcribing spoken language into text before translation and are now using neural nets to directly translate the audio signal into another language, and it works surprisingly well.
1

stop

6580

5y

@deadlyRants my theory is that this works because neural nets found the origin of language, or a base form closely related to it.
1

lorentz

15364

5y

Very often text refers to some implication of the statement (e.g. sarcasm refers to the implications of the statement regarding the speaker), or to the phonetic properties of the text (puns), or to some falsely implied but convenient logical connection (doublespeak).
All of these are difficult for computers to even identify, and when done right none of them produce noticeable syntactic anomalies.
1

lorentz

15364

5y

@stop Another reason could be that spoken language is usually orders of magnitude simpler than written language.
1

stop

6580

5y

@homo-lorens I would say it's rather complicated. The text and the phonetics influence each other. "I love you" can be negated just by the way it is spoken.
2

lorentz

15364

5y

@stop Certainly, but tone is much simpler for a neural net to crack than the rich context of written text, or the grammatical structures that would represent this negation. I think this is why audio translation works much better, plus the fact that spoken language usually stresses the important parts in a sentence.
3

stop

6580

5y

@homo-lorens some parts make it easier, but in my opinion spoken language is harder because:
1. not everyone speaks the same way, especially when it comes to the inside jokes people sometimes use.
2. everyone speaks in a dialect, for example how would you translate "Nahd"?
3. not everyone uses the same words/sounds with the same meaning.
As a whole, I see spoken language as understanding the text, plus understanding what modifies the text, plus understanding the differences from the common language.
1

RememberMe

13617

5y

What you're talking about is something Noam Chomsky formalized with his grammars for generating language sentences. This is what natural language processing folks do, but in practice those grammar rules are not easy to construct at all because of the irregular nature of most languages.
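
As a rough illustration of what such a generative grammar looks like, here is a toy context-free grammar in Python. The rules and vocabulary are invented for this sketch and are nowhere near a real grammar of English.

```python
import random

# Hypothetical toy grammar: each nonterminal maps to a list of productions.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "N": [["compiler"], ["linguist"]],
    "V": [["parses"], ["sleeps"]],
}


def generate(symbol, rng):
    """Expand a symbol by recursively applying randomly chosen productions."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: an actual word
    production = rng.choice(GRAMMAR[symbol])
    words = []
    for part in production:
        words.extend(generate(part, rng))
    return words


def sentence(seed):
    """Generate one sentence; the seed makes the random choices repeatable."""
    return " ".join(generate("S", random.Random(seed)))


for seed in range(3):
    print(sentence(seed))
```

Even this tiny grammar overgenerates: its rules allow "the compiler sleeps the linguist", because nothing says that "sleeps" takes no object. Patching that requires more rules, and real languages need such patches everywhere, which is the construction problem described above.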

Which is why modern translation systems use deep learning to learn representations that drive translations instead (but yes, there is a lot of domain knowledge that goes into translation). NLP is a pretty fascinating topic.

@Fast-Nop yes, but replace @Condor's "linguists" with deep learning using modern techniques like attention, memory, and recurrence (though practically nobody uses RNNs anymore) and they can learn good enough embeddings and representations to do decently accurate translation. Of course this has to be tuned by experts, but the system works quite well in practice.
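
For anyone wondering what "attention" means mechanically, here is a minimal sketch of scaled dot-product attention for a single query, in plain Python. The vectors are made up; real systems learn them and run this over large matrices.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Weight each value by how well its key matches the query."""
    dim = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
        for key in keys
    ]
    weights = softmax(scores)
    output = [
        sum(w * value[i] for w, value in zip(weights, values))
        for i in range(len(values[0]))
    ]
    return weights, output


# Illustrative 2-dimensional vectors, not learned embeddings.
weights, output = attention(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[10.0, 0.0], [0.0, 10.0]],
)
print(weights, output)
```

The query matches the first key more closely, so the first value dominates the output. Nothing in there is a grammar rule; the relationships between words end up encoded in the learned vectors instead.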
1

brunofontes

2025

5y

People have tried it already. Actually, the first MT engines were rule-based. We had several translators working on this (that was when I worked at Lionbridge, a long time ago).
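
A deliberately naive sketch of the rule-based idea, in Python. This is not how any real engine worked in detail; the tiny German-to-English lexicon is hypothetical and the only "rule" is word-for-word substitution.

```python
# Hypothetical word-for-word lexicon (German -> English), illustration only.
LEXICON = {
    "ich": "I",
    "habe": "have",
    "einen": "a",
    "hund": "dog",
    "hunger": "hunger",
}


def translate(sentence):
    """Translate word by word; unknown words are kept in square brackets."""
    translated = []
    for word in sentence.lower().split():
        translated.append(LEXICON.get(word, "[" + word + "]"))
    return " ".join(translated)


print(translate("Ich habe einen Hund"))  # I have a dog
print(translate("Ich habe Hunger"))      # I have hunger
```

The first sentence comes out fine. The second is a correct mapping of every word and still not what an English speaker would say ("I am hungry"), so every such case needs its own extra rule.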

But AI gets better results. Language is more than different words or sentences. Sometimes the strings match, but the feeling is different. A totally normal phrase in English might sound pretentious in another language even when perfectly translated.

That's why we tend to use native translators whenever possible.

Language is not so easy.
0

fuckyouall

2801

5y

@stop You have a gross misunderstanding of what machine learning is.
0

stop

6580

5y

@junon probably.
2

Fast-Nop

36583

5y

@junon "Machine Learning" is really just another name for "block chain", which is another name for "cloud" (aka SaaS - Service as a Service).
3

fuckyouall

2801

5y

@Fast-Nop and all of it is Blazing Fast.

question

shower thoughts