I asked the Spanish tutor if it was familiar with the terms seseo[0] and ceceo[1] and it said it wasn't, which surprised me. Ideally it would be possible to choose which Spanish dialect to practise, as mainland Spain's pronunciation is very different from Latin America's. In general it didn't convince me it was really hearing how I was pronouncing words, which is an important part of learning a language. I would say the tutor is useful for intermediate and advanced speakers but not beginners, due to this and the speed at which it speaks.
At one point subtitles written in pseudo Chinese characters were shown; I can send a screenshot if this is useful.
The latency was slightly distracting, and as others have commented the NVIDIA Personaplex demos [2] are very impressive in this regard.
In general, a very positive experience, thank you.
Thanks for the feedback. The current avatars use a STT-LLM-TTS pipeline (rather than true speech-to-speech), which limits nuanced understanding of pronunciations. Speech-to-speech models should solve this problem. (The ones we've tried so far have counterintuitively not been fast enough.)
the chinese text happened last night in your main chat agent widget, the cartoon woman professing to be in a town in brazil with a lemon tree on her cupboard. she claimed it was a test of subtitling then admitted it wasn't.
btw, she gives helpful instructions like "/imagine" whatever, but the instructions only seem to work about 50% of the time: try the same command or variants a few times, and about half of them work. she never did shift out of her aussie accent though.
she came up with a remarkably fanciful explanation of why, as a brazilian, she sounded aussie, and why imagining a native accent, which she said would work, didn't...
i was shocked when /imagine face left turn to the side did actually work: the agent turned to side profile and was precisely as natural as the original front-facing avatar
all in all, by far the best agent experience i've played with!
So glad you enjoyed it! We've been able to significantly reduce those text hallucinations with a few tricks, but it seems they haven't been fully squashed. The /imagine command only works with the image at the moment, but we'll think about ways to tie that into the personality and voice. Thanks for the feedback!
Each pawn that wants to be promoted either takes:
(a) another 'special' piece (knight/rook/bishop/queen), in which case it has already bought enough bit budget to later be promoted; or
(b) another pawn, in which case this temporarily saves 1 bit (as the other pawn becomes a space), but then later we need 2 extra bits for the promotion, so we pay 1 bit extra per pawn in total
In the case of (b) there are now fewer pawns that can be promoted, so in the worst case we have to pay a budget of 1 extra bit for each of 8 promoted pawns.
So I think maximum required bits is only 162 + 8 = 170?
So for each 4 pawn cluster, 1 pawn takes another pawn, and the net result is +1 bit once the captor promotes. The remaining 2 pawns in the cluster each need 2 extra bits when promoted => 2 x 2 = 4 bits. So 5 bits per 4-pawn cluster, of which there are 4.
So maximum representation would be 162 + (5 * 4) = 182 bits?
Yep, that increases the total by 3*3-4=5 bits, and you can repeat it 4 times, so the maximum is at least 162+4*5=182.
I'm trying to prove that this is the worst case, but there are just too many cases. I guess I'll try to write a program to brute-force it, or just forget about it.
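Not the brute-force search itself, but the bit accounting in the two comments above can at least be sanity-checked with a few lines of Python (a minimal sketch; the 162-bit base figure is taken from upthread):

```python
# Sanity check of the per-cluster bit accounting, not a proof of the
# worst case. BASE is the 162-bit figure assumed upthread.
BASE = 162

# Per 4-pawn cluster: one pawn captures another pawn (+1 bit net once
# the captor promotes), and the remaining 2 pawns each cost 2 extra
# bits when promoted.
extra_per_cluster = 1 + 2 * 2   # = 5, matching the 3*3 - 4 above
clusters = 4

total = BASE + clusters * extra_per_cluster
print(total)  # 182
```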
Actually, given this, we believe that 4 pawns must have been captured to reach 182 bits. So at least 4 pieces no longer need colors. If we store the color mask at the end, I think we can make it variable length, and truncate when no further pieces need colors assigned.
So then we need at most 182 - 4 = 178 bits
EDIT: Equivalently, we could suffix each non-empty piece in the sequence with an associated color bit
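The truncation saving above is a one-liner, but spelling it out makes the claim concrete (a sketch assuming, per the comment, that each captured pawn frees exactly one color bit from the mask):

```python
# Reaching the 182-bit worst case requires 4 captured pawns, so 4
# color bits can be truncated from the end of the variable-length
# color mask (assumption from the comment above: one bit per piece).
WORST_CASE = 182
captured_pawns = 4              # pieces that no longer need a color bit
total = WORST_CASE - captured_pawns
print(total)  # 178
```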
I thought the same but realized you can retrospectively 'insert' the king positions into the position sequence, shifting the remaining sequence along one square for each king, so no more bits are required, though the data structure is unwieldy!
[0] https://en.wikipedia.org/wiki/Phonological_history_of_Spanis... [1] https://en.wikipedia.org/wiki/Phonological_history_of_Spanis... [2] https://research.nvidia.com/labs/adlr/personaplex/