"Ithaca restored artificially produced gaps in ancient texts with 62% accuracy, compared with 25% for human experts. But experts aided by Ithaca’s suggestions had the best results of all, filling gaps with an accuracy of 72%. Ithaca also identified the geographical origins of inscriptions with 71% accuracy, and dated them to within 30 years of accepted estimates."
and
"[Using] an RNN to restore missing text from a series of 1,100 Mycenaean tablets ... written in a script called Linear B in the second millennium bc. In tests with artificially produced gaps, the model’s top ten predictions included the correct answer 72% of the time, and in real-world cases it often matched the suggestions of human specialists."
Obviously 62%, 72%, 72% in ten tries, etc. is not sufficient by itself. How do scholars use these tools? Without some external source to verify the truth, you can't know if the software output is accurate. And if you have some reliable external source, you don't need the software.
Obviously, they've thought of that, and it's worth experimenting with these powerful tools. But I wonder how they've solved that problem.
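To make the numbers concrete: "72% in ten tries" is just top-k accuracy over artificially masked spans. A minimal sketch of how that kind of score gets computed (predict_candidates is a hypothetical stand-in, not Ithaca's actual interface):

    def top_k_accuracy(examples, predict_candidates, k=10):
        # examples: list of (damaged_text, true_fill) pairs, built by
        # deleting a known span from an intact text
        hits = 0
        for damaged_text, true_fill in examples:
            guesses = predict_candidates(damaged_text, k)  # k ranked restorations
            if true_fill in guesses:
                hits += 1
        return hits / len(examples)

Which is exactly where the verification problem bites: even at 72%, a scholar still has to pick the right candidate out of ten.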
sapphicsnail 5 days ago [-]
> Obviously 62%, 72%, 72% in ten tries, etc. is not sufficient by itself. How do scholars use these tools? Without some external source to verify the truth, you can't know if the software output is accurate. And if you have some reliable external source, you don't need the software.
Without an extant text to compare against, everything would be a guess. Maybe this would be helpful if you're trying to get a quick and dirty translation of a bunch of papyri or inscriptions? Until we have an AI that's able to adequately explain its reasoning, I can't see this replacing philologists with domain-specific expertise who are able to walk you through the choices they made.
EA-3167 5 days ago [-]
I wonder if maybe the goal is to provide the actual scholars with options, approaches or translations they hadn't thought of yet. In essence just what you said, structured guessing, but if you can have a well-trained bot guess within specific bounds countless times and output the patterns in the guesses, maybe it would be enough. Not, "My AI translated this ancient fragment of text," but "My AI sent us in a direction we hadn't previously had the time or inclination to explore, which turned out to be fruitful."
mmooss 5 days ago [-]
I agree, but let's remember that the software repeats patterns; it doesn't so much innovate new ones. If you get too dependent on it, theoretically you might not break as much new ground, find new paradigms, discover the long-mistaken assumption in prior scholarship (which the software is repeating), etc.
Zancarius 4 days ago [-]
Human proclivities tend toward repetition as well, partly as a memory/mnemonic device, so I don't see this as disadvantageous. For example, there's a minority opinion in biblical scholarship that John 21 was a later scribal addition, because the end of John 20 seems to mark the end of the book itself. However, John's tendency to use specific verbiage and structure provides a much stronger argument that the book, including chapter 21, was written by the same author, suggesting that the last chapter is an epilogue.
Care needs to be taken, of course, but ancient works often followed certain patterns or linguistic choices that can be used to identify authorship. As long as this is viewed as one tool of many, there's unlikely to be much harm, unless scholars lean too heavily on the opinions of AI analysis (which is the real risk, IMO).
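As a toy illustration of the kind of pattern test I mean: compare function-word frequency profiles of two texts as weak evidence of shared authorship. The word list here is a made-up placeholder (real stylometric studies use dozens of function words), not any published study's setup:

    from collections import Counter

    FUNCTION_WORDS = ["kai", "de", "gar", "men", "oun"]  # e.g. Greek particles

    def profile(tokens):
        # relative frequencies of the chosen function words
        counts = Counter(t for t in tokens if t in FUNCTION_WORDS)
        total = sum(counts.values()) or 1
        return [counts[w] / total for w in FUNCTION_WORDS]

    def style_distance(text_a, text_b):
        pa, pb = profile(text_a.split()), profile(text_b.split())
        return sum(abs(x - y) for x, y in zip(pa, pb))

A small distance between John 1-20 and John 21 would support, not prove, the single-author reading.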
mmooss 4 days ago [-]
> unless scholars lean too heavily on the opinions of AI analysis (which is the real risk, IMO).
This is what I was talking about. Knowledge and ideas often develop by violating prior patterns. If your tool is (theoretically) built to repeat prior patterns, and it frames your work, you might not be as innovative. But this is all very speculative.
Validark 4 days ago [-]
Interesting point in theory but I'd love to get to the point where our problem is that we solved all the problems we already know how to solve.
rnd0 4 days ago [-]
Thank you, and also I'd like to know how they'd even evaluate the results to begin with...
I hope to GOD they're holding on to the originals so they can go back and redo this in 20 or 30 years when tools have improved.
manquer 4 days ago [-]
If the texts are truly missing, then accuracy is subjective? I.e., human opinion versus AI generation.
ip26 4 days ago [-]
> artificially produced gaps in ancient texts
Someone deleted part of a known text.
This does require that the AI hasn't been trained on the test text previously...
rtkwe 4 days ago [-]
They do mention in the article that the missing-data test was done on "new" data that the models were not trained on, so it's not just regurgitation, at least for some of the results, it seems.
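The check is conceptually simple; something like this rough sketch (the n-gram length and threshold are arbitrary placeholders) is what you'd want to see run:

    def char_ngrams(text, n=12):
        return {text[i:i + n] for i in range(len(text) - n + 1)}

    def flag_possible_leaks(train_texts, test_texts, n=12, threshold=0.2):
        # flag test items sharing many long character n-grams with the
        # training corpus, i.e. candidates for memorized regurgitation
        train_grams = set().union(*(char_ngrams(t, n) for t in train_texts))
        leaks = []
        for t in test_texts:
            grams = char_ngrams(t, n)
            overlap = len(grams & train_grams) / max(len(grams), 1)
            if overlap > threshold:
                leaks.append((t, overlap))
        return leaks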
BeefWellington 4 days ago [-]
One way to test this kind of efficacy is to compare against a known sample with a missing piece, e.g.: create an artifact with known text, destroy it in similar fashion, and compare the model's suggested restorations with the real, known text.
The "known" sample would need to be handled and controlled for by an independent trusted party, obviously, and therein lies the problem: It will be hard to properly configure an experiment and believe it if any of the parties have any kind of vested interest in the success of the project.
mmooss 4 days ago [-]
> If the texts are truly missing, then accuracy is subjective?
Then accuracy might be unknown but it's not subjective.
userbinator 5 days ago [-]
The full title is "How AI is unlocking ancient texts — and could rewrite history", and that second part is especially fitting, although unfortunately not mentioned in the article itself, which is full of rather horrifying stories about using AI to "fill in" missing data, which is clearly not true data recovery in any meaningful sense.
I am aware of how advanced algorithms, such as those used for flash memory today, can "recover" data from the imperfect probability distributions naturally created by NAND flash operation, but there seems to be a huge gap between those, which are based on well-understood information-theoretic principles, and the AI techniques described here.
mistrial9 4 days ago [-]
related -- A Chinese woman, known by the pseudonym Zhemao, was found to have been rewriting and falsifying Russian history on Wikipedia for over a decade. She created over 200 detailed articles on the Chinese Wikipedia, which included fictitious states, battles, and aristocrats. The hoax was exposed when a Chinese novelist, Yifan, researching for a book, stumbled upon an article about the Kashin silver mine. Yifan noticed that the article contained extensive details that were not supported by other language versions of Wikipedia, including the Russian one.
Zhemao, posing as a scholar, claimed to have a Ph.D. in world history from Moscow State University and was the daughter of a Chinese diplomat based in Russia. She used machine translation to understand Russian-language sources and filled in gaps with her own imagination. Her articles were well-crafted and included detailed references, making them appear credible. However, many of the sources cited were either fake or did not exist.
The articles were eventually investigated by Wikipedia editors who found that Zhemao had used multiple “puppet accounts” to lend credibility to her edits. Following the investigation, Zhemao was banned from Chinese Wikipedia, and her edits were deleted.
adriand 5 days ago [-]
I find this incredibly exciting. There could be some truly remarkable works whose contents are about to be revealed, and we don’t really know what we might find. Histories of the ancient (more ancient) world. Accounts of contact with cultures and civilizations that are currently lost to history. Scientific and mathematical discoveries. And what I often find to be the most moving: stories of daily life that illuminate what regular people thought and felt and experienced thousands of years ago.
Applejinx 4 days ago [-]
Which becomes a real gotcha when it turns out to be hallucinated 'content' misleading people into following their assumptions on what regular people thought and felt and experienced thousands of years ago.
What we call AI does have superhuman powers but they are not powers of insight, they are powers of generalization. AI is more capable than a human is of homogenizing experience down to what a current snapshot of 'human thought' would be, because it's by definition PEOPLE rather than 'person'. The effort to invoke a specific perspective from it (that seems ubiquitous) sees AI at its worst. This idea that you could use it to correctly extract a specific perspective from the long dead, is wildly, wildly misguided.
sans_souse 4 days ago [-]
This concerns me. How do we assess the AI's interpretation when it comes to what we ourselves can't see? Have we not learned that AI desperately wants to supply answers, to the point that it prioritizes answers over accuracy? We already lose enough in translation, and do a fine job of twisting the words we can discern; I'd really prefer we not start filling the gaps with lies formed of regurgitated data pools, with the model sourcing whatever fabricated fluff it ends up using to fill said gaps.
indymike 4 days ago [-]
> This concerns me. How do we assess the AI's interpretation when it comes to what we ourselves can't see?
Sometimes a clue or nudge can trigger a cascade of discovery. Even if that clue is wrong, it causes people to look at something they maybe never would have. In any case, as long as we're reasonably skeptical, this is really no different from a very human way of working: have you tried "...fill in wild idea..."?
> I'd really prefer we not start filling the gaps with lies formed of regurgitated data pools
A lie requires an intent to deceive, and that is beyond the capability of modern AI. In many cases a lie can reveal an adjacent truth, and I suspect that is what is happening here. Regardless, finding truth in history is really hard, because the record is often filled with actual lies intended to make the victor or ruler look better.
d357r0y3r 4 days ago [-]
The AI interpretation can be folded into a multidisciplinary approach. We wouldn't merely take the AI's word for it: does this interpretation make sense given what historians and anthropologists have learned, etc.?
dismalaf 4 days ago [-]
> Have we not learned that AI desperately wants to supply answers, to the point that it prioritizes answers over accuracy?
Have you ever met an archaeologist?
throw4847285 4 days ago [-]
Yeah, I know a number of archaeologists. Among academics, they are some of the most conservative when it comes to drawing sweeping conclusions from their research. A thesis defense is a brutal exercise in being accused of crimes against parsimony by your mentors and peers.
Electricniko 4 days ago [-]
I like to think that today's clickbait data pools are perfect for translating ancient texts. The software will see modern headlines like "Politician roasts the opposition for spending cuts" and come up with translations like "Emperor A roasted his enemies" and it will still be correct.
palmfacehn 4 days ago [-]
What is 'accuracy' when examined at depth?
With the benefit of greater knowledge and context we are able to critique some of the answers provided by today's LLMs. With the benefit of hindsight we are able to see where past academics and thought leaders went wrong. This isn't the same as confirming that our own position is a zenith of understanding. It would be more reasonable to assume it is a false summit.
Could we not also say that academics face a "publish or perish" imperative? When we use the benefit of hindsight to examine debunked theories, could we not also say that their authors were too eager to supply answers?
I agree about models filling the gaps with whatever is most probable. That's what they are designed to do. My quibble is that humans often synthesize the least objectionable answers based on group-think, institutional norms and pure laziness.
watt 4 days ago [-]
Why wouldn't you prefer _something_ over _nothing_? I assume AI steps in on problems that people haven't been able to begin to solve in decades.
Majestic121 4 days ago [-]
It's much better to have _nothing_ than the wrong _something_, since with a wrong _something_ you build assumptions on wrong premises.
Much better to accept that we don't know (hopefully only temporarily), so that people can keep looking into it instead of falsely believing the problem is solved.
davidclark 4 days ago [-]
Absolutely prefer nothing here.
throw4847285 4 days ago [-]
I bet Heinrich Schliemann would have loved AI.
xenospn 4 days ago [-]
That _something_ could be worse than nothing.
teleforce 4 days ago [-]
I hope we can decipher the Indus script using AI [1].
It's well overdue. From statistical profiling, it's believed to be a genuine linguistic script used as the writing system of the ancient Harappan language, the likely precursor of the modern Dravidian languages.
[1] https://en.wikipedia.org/wiki/Indus_script
Claude is all too willing to provide interpretations. Why not give it a go and see if you can’t crack it yourself? Hypothesis generation is needed!
taffronaut 5 days ago [-]
From TFA: "decoding rare and lost languages of which hardly any traces survive". Assuming that's not hype, let's see it have a go at Rongorongo [1] then.
[1] https://en.m.wikipedia.org/wiki/Rongorongo
It's mentioned in the article that they hope the model for Linear B can also help with Linear A (https://en.wikipedia.org/wiki/Linear_A).
mlepath 5 days ago [-]
This is a great application of several domains of ML. It reminds me of the Vesuvius Challenge. This kind of thing is accessible to beginners too, since the data are by definition pretty limited.
jhanschoo 4 days ago [-]
Perhaps you missed it while skimming, but indeed, the Vesuvius Challenge is a primary topic of discussion in the article :)
cormorant 5 days ago [-]
Does anyone have a subscription or can otherwise read past the heading "A flood of information"? (I can see ~2500 words but there is apparently more.)
It's cut off on archive.is too. Can Springer Nature not afford to let us all read the full article, or what? Do they really need $70 for a single page of information again?
zozbot234 5 days ago [-]
The really nice thing about this is that the AI can now acquire these newly-decoded texts as part of its training set, and begin learning at a geometric rate.
nitwit005 4 days ago [-]
With our current methods, feeding even fairly small amounts of output back in as training data leads to declining performance.
Just think of it abstractly. The AI will be trained on the errors the previous generation made. As long as it keeps making new errors each generation, they will tend to multiply.
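Toy arithmetic for the abstract version (the 2% per-generation error rate is invented purely for illustration):

    def error_rate_after(generations, new_error_rate=0.02):
        # each generation inherits its training data's errors and adds
        # fresh errors on the remaining correct fraction
        rate = 0.0
        for _ in range(generations):
            rate = rate + (1 - rate) * new_error_rate
        return rate

    # error_rate_after(1)  -> 0.02
    # error_rate_after(10) -> ~0.18; errors compound unless filtered out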
red75prime 4 days ago [-]
Degradation of autoregressive models being fed their own unfiltered output is pretty obvious: it's basically noise being injected into the ground-truth probability distribution.
But. "Our current methods" include reinforcement learning. So long as there's a signal indicating better solutions, performance tends to improve.
zeofig 4 days ago [-]
Why not just feed it random data? It's so smart that it will figure out which parts are random, so eventually you will generate some good data randomly, and it will feed on it, and become exponentially smarter exponentially fast.
Validark 4 days ago [-]
This is actually hilarious and I'm sad you are getting downvoted for it.
But I do want to see ancient programming advice written in Linear B.
Oarch 4 days ago [-]
What if the deciphered content is the ancient equivalent of Vogon poetry? Do we stop?
Octoth0rpe 4 days ago [-]
No, but the translation process would transfer from academia to the military industrial complex.
aaronbrethorst 5 days ago [-]
(2024)
Tagbert 5 days ago [-]
:-)
yzydserd 5 days ago [-]
Klaatu barada nikto
masfoobar 3 days ago [-]
Let's hope the AI tool being used doesn't convert to a particular religion and distort ancient texts that challenge its beliefs, especially ones that predate its "historically accurate" stories by hundreds or thousands of years.
human - "Please compile these texts"
AI - "done! Here you go"
human - "114AD? Are you sure. We expect this to be around 100BC"
AI - "NO! Nothing to see, here! Water turning into Wine was clearly added to this God AFTER our lord and saviour!"
human - "But I have this thing.."
AI - "NOTHING TO SEE, HERE! THEY WERE PLANTED TO TEST US! TEST US!"
...
...
"BURN IT!"
Sparkyte 5 days ago [-]
Can't wait to read ancient smut from the time.
sapphicsnail 5 days ago [-]
I wouldn't call it smut, but there are 5 surviving Greek novels and some Roman elegiac poetry that's a little horny. We know there used to be a lot of crazier stuff, but it mostly doesn't survive.
InsOp 4 days ago [-]
Is there any news on the Voynich manuscript?
datavirtue 5 days ago [-]
Nothing to see here. LLMs and AI suck and aren't really good at anything. /s
The world is about to change much faster than any of us have ever witnessed to this point. What a life.
muglug 5 days ago [-]
There’s a big difference between LLMs and this application of CNNs and RNNs.
Very few people on HN are claiming there’s no value to neural networks — CNNs have been heralded here for well over a decade.
mcphage 5 days ago [-]
There are definitely things they're good at. And there are definitely things that they're bad at, worse than nothing at all. The problem is how often they're being used in the latter case, and how rarely in the former.
zeofig 4 days ago [-]
Build a strawman, knock him down, and plant the glorious flag of hyperbole on his strawwy corpse.
fuzztester 5 days ago [-]
>How AI is
I almost read it as "How Ali is" due to speed reading and the font in the original article. :)
And now I wonder how AI would do on that same test :)
Chat, GPT!
human - "Please compile these texts"
AI - "done! Here you go"
human - "114AD? Are you sure. We expect this to be around 100BC"
AI - "NO! Nothing to see, here! Water turning into Wine was clearly added to this God AFTER our lord and saviour!"
human - "But I have this thing.."
AI - "NOTHING TO SEE, HERE! THEY WERE PLANTED TO TEST US! TEST US!"
...
...
"BURN IT!"
The world is about to change much faster than any of us have ever witnessed to this point. What a life.
Very few people on HN are claiming there’s no value to neural networks — CNNs have been heralded here for well over a decade.
I almost read it as "How Ali is" due to speed reading and the font in the original article. :)
And now I wonder how AI would do on that same test :)
Chat, GPT!