
Audio CDs were never ripped/transferred as ISO files. ISO-9660 is a filesystem that came years later, and Redbook audio CDs simply do not contain files.

If you want to look at the structure of a whole audio CD, then one way is to rip it with a decent tool (perhaps cdrdao or EAC) and generate a bin/cue file pair as an output.



But that's not my goal. I'd like to be able to observe every groove, the physical encoding of data, and see if I could implement decoding from scratch. The first problem, though, is that I don't know how to get a microscopic image of the disc.


You don't need a microscopic image of a disc to do that; a two-dimensional photograph is of essentially no advantage here.

All you need is the unmolested data from that disc. The data is arranged on a single spiral groove starting from the center and slowly winding its way towards the outside.

The data is completely linear: It begins at the beginning, and continues to the very end without interruption. This is all akin to (although opposite of) how a single-track vinyl record is physically laid out. The entire CD -- whatever it contains -- is just a continuous string of pits and lands.

And to observe that string as it appears on a real disc, all you need to get started is a regular old-school CD player and some appropriate data acquisition gear, and maybe an oscilloscope to help figure out what you're looking at.

The optics and basic motor controls are already solved problems, and it doesn't even have to be particularly fast data acquisition gear by today's standards to record what is happening.


Look into the Domesday Duplicator project for Laserdiscs as an example of how what ssl-3 is talking about can be done using a high sample rate input. That exact process is possible and, with enough storage and processing power, can be used to get the most "low level" access to the data. It is not for the faint of heart though, and can take around 1TB of storage and hours of CPU time to process full movies in this way. I know because I've done it.

I believe I've seen there is work being done to attempt this on CDs but it would have still been in the exploratory phases and not yet ready to start archiving with. It might seem like overkill to do this to something meant to be digitally addressed but I've experienced enough quirks with discs and drives when ripping that I would 100% be willing to switch over to a known complete capture system to not have to worry about it anymore. Post process decoding also allows for re-decoding data later if better methods are found.


The "unmolested data" would still have undergone error correction though, wouldn't it? I don't think a bin/cue rip would contain the redundant stuff, which GP seems interested in, nor the subcodes (of which some are represented in the cue file, while the bin file is PCM audio).

And at the risk of taking us well beyond the rainbow books, I'll just leave this here: https://www.psxdev.net/forum/viewtopic.php?f=70&t=1266


There is a layer betwixt the optical reflection and the audio output that exists only as raw signals, before any molestation/error correction occurs.

There cannot not be this layer.

(And with a sufficiently-old-school CD player, it is probably not even challenging to get to it. The less-integrated the parts are, the better.)


Ah, I see. So what kind of capture hardware could read from that point? I assume it's a digital signal switching between two voltage levels, flipping on the order of 3.6 MHz (16 billion pits to read over 74*60 seconds). With Red Book audio at 1.4 Mbps, more than half of the raw data must be devoted to things like redundancy and other non-PCM stuff, if my interpretation that pits==bits isn't far off.
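For what it's worth, the arithmetic behind those two numbers can be checked in a few lines. This is only a sketch: the 16 billion pit count is taken as given from the estimate above (not verified), and the function names are made up for illustration.

```python
# Back-of-envelope check of the figures in the comment above.
PITS_ON_DISC = 16e9        # quoted estimate, taken as given
PLAY_TIME_S = 74 * 60      # a 74-minute disc

SAMPLE_RATE_HZ = 44_100    # Red Book audio: 44.1 kHz
BITS_PER_SAMPLE = 16
CHANNELS = 2


def pit_rate_hz(pits=PITS_ON_DISC, seconds=PLAY_TIME_S):
    """Average number of pits passing under the laser per second."""
    return pits / seconds


def pcm_bit_rate():
    """Bit rate of the decoded stereo PCM audio."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS


if __name__ == "__main__":
    print(f"pit rate: {pit_rate_hz() / 1e6:.2f} M/s")
    print(f"PCM rate: {pcm_bit_rate() / 1e6:.4f} Mbps")
    print(f"PCM share if pits==bits: {pcm_bit_rate() / pit_rate_hz():.0%}")
```

Under the pits==bits assumption the PCM audio comes out at well under half of the raw stream, which is consistent with the guess above.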

Aside: is your username inspired by Secure Socket Layer or Solid State Logic?


I'm getting off into the weeds of what I know here, so take this all with a grain of salt. (I probably used to know more about all of this than I do right now.)

The difference between a pit and a land is an optical phase change. The pits and lands vary in length, and there are 9 valid variations in their lengths. This combined phase/temporal situation eventually (thanks, science folks from 1970-something!) turns into a serial binary electrical signal inside of a CD player.
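To make the "9 valid lengths" part a bit more concrete, here is a minimal sketch of how a captured two-level signal might be sorted into run lengths. It assumes the valid runs are 3 to 11 channel-bit periods long (3T to 11T, which is what gives nine values) and that the capture is oversampled at a known, fixed number of samples per period. The function names and the sample capture are hypothetical; a real capture would need clock recovery first.

```python
from itertools import groupby

# Assumption: valid run lengths are 3T..11T, where T is one channel-bit period.
MIN_T, MAX_T = 3, 11


def run_lengths(samples):
    """Collapse a sampled two-level signal into (level, sample_count) runs."""
    return [(level, sum(1 for _ in group)) for level, group in groupby(samples)]


def classify(samples, samples_per_t):
    """Round each run to a whole number of T and flag runs outside 3T..11T."""
    result = []
    for level, count in run_lengths(samples):
        t = round(count / samples_per_t)
        result.append((level, t, MIN_T <= t <= MAX_T))
    return result


if __name__ == "__main__":
    # Hypothetical capture at 4 samples per T: 3T high, 11T low, then a 2T glitch.
    capture = [1] * 12 + [0] * 44 + [1] * 8
    for level, t, ok in classify(capture, samples_per_t=4):
        print(level, f"{t}T", "ok" if ok else "invalid")
```

Runs that fall outside the valid range are a cheap first hint that the sample clock is off or that the disc is damaged at that spot.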

This binary electrical signal can be recorded.

Recorded with what, you asked?

CDs have a lot more going on than just audio data: Remember, there's forward error correction at play and (by spec, IIRC) a player is supposed to be able to completely recover data even if there is a gap of 1mm due to a scratch or other interruption. (There's also room for tricks like CD+G to live in the background, and certainly what may seem like an inordinate amount of data used just for clocking: CDs are CLV, so playing them happens at a continuously-varying rotational speed in a tightly closed loop because buffer RAM was expensive to buy, and expensive to manage, and tight speed control was cheaper to implement. Remember, this was a finished digital product that was released in 1981.)

I find old references[0] that suggest that the raw data rate of a CD (it does not matter what kind) is 4.3218 Mbps.
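As a sanity check on that figure, it can be rebuilt from the frame structure. This sketch assumes the usual description of a CD frame: 588 channel bits carrying 24 bytes of audio, i.e. 6 stereo samples. Those two numbers are not in the reference above, so treat them as my assumption.

```python
SAMPLE_RATE_HZ = 44_100
STEREO_SAMPLES_PER_FRAME = 6    # assumption: 24 audio bytes / 4 bytes per stereo sample
CHANNEL_BITS_PER_FRAME = 588    # assumption: channel bits per frame, incl. sync and merging bits

# One frame is consumed for every 6 stereo samples played.
frames_per_second = SAMPLE_RATE_HZ // STEREO_SAMPLES_PER_FRAME
channel_bit_rate = frames_per_second * CHANNEL_BITS_PER_FRAME
pcm_bit_rate = SAMPLE_RATE_HZ * 16 * 2

if __name__ == "__main__":
    print(f"frames per second: {frames_per_second}")
    print(f"channel bit rate:  {channel_bit_rate / 1e6:.4f} Mbps")
    print(f"PCM share of raw:  {pcm_bit_rate / channel_bit_rate:.1%}")
```

If those frame numbers are right, only about a third of what comes off the disc is audio samples; the rest is modulation overhead, error correction, subcode and sync.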

So, to posit some example hardware: With careful loops and decent wiring, accurately capturing this seems like it would be well within the purview of an RP2040's PIO's DMA modes to get that data into RAM, and also well within the ability of one of its 133MHz 32-bit ARM cores to package up and deliver that data over USB 2 to a host machine that can store it for later analysis -- plus or minus a transistor or two, or maybe a pullup resistor in just the right spot.
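A rough throughput budget for that idea, as a sketch only. It takes the 4.3218 Mbps figure cited above, and it assumes the USB link runs at the nominal 12 Mbit/s full-speed signalling rate (usable throughput is lower in practice) and that the disc is 74 minutes long. `capture_budget` and `oversample` are names invented for this example.

```python
CHANNEL_BIT_RATE = 4_321_800      # bits/s, the 4.3218 Mbps figure cited above
USB_FULL_SPEED_BPS = 12_000_000   # assumption: nominal full-speed USB signalling rate
PLAY_TIME_S = 74 * 60             # assumption: a 74-minute disc


def capture_budget(oversample=1):
    """Estimate data rates for capturing `oversample` samples per channel bit."""
    bits_per_s = CHANNEL_BIT_RATE * oversample
    return {
        "bytes_per_s": bits_per_s / 8,
        "usb_fraction": bits_per_s / USB_FULL_SPEED_BPS,
        "disc_bytes": bits_per_s * PLAY_TIME_S / 8,
    }


if __name__ == "__main__":
    for n in (1, 2, 3):
        b = capture_budget(n)
        print(f"{n}x: {b['bytes_per_s'] / 1e3:.0f} kB/s, "
              f"{b['usb_fraction']:.0%} of nominal USB, "
              f"{b['disc_bytes'] / 1e9:.2f} GB per disc")
```

At one sample per channel bit the numbers look comfortable, but that presumes the capture is clocked from the player's recovered bit clock. Blind oversampling eats the USB budget quickly.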

(But that's just my opinion as a home hacker who has dabbled in RP2040 PIO assembler, and who is at or a bit beyond their knowledge of compact discs. I may wake up tomorrow and decide that the above is all bullshit and wish I could erase all of it. If in doubt, Philips datasheets for CD player chipsets from the first half of the 1980s can probably help a lot more than I can.)

---

As to the username: It's old. It predates Secure Socket Layer, but it's way newer than Solid State Logic. I was just a young kid with a new modem when I dialed into a Telegard BBS and started to sign up for an account, and got stuck at the prompt to enter a "Handle". I didn't know what a handle was in this new-to-me context.

The sysop saw that I was stuck and dragged me into chat, as good sysops (hi Shawn!) tended to do upon seeing such a thing. We chatted for a bit, and I wasn't feeling creative, so he suggested that maybe I could look around for inspiration since most people used a made-up handle on his particular BBS.

I found a 5.25" floppy disk on the desk that I'd borrowed from the local public library. It was labeled "Selective Shareware Library, Volume 3." (It was also almost certainly infected with the Stoned virus[1]).

Anyhow, that was sufficiently inspiring, so ssl-3 it was.

---

0: https://www.geocities.ws/columbiaisa/cd_specs.htm

1: https://en.wikipedia.org/wiki/Stoned_(computer_virus)


Not necessarily. It depends if you're extracting data+subchannel data or corrected track data only.


You might do well enough with https://en.wikipedia.org/wiki/Cdparanoia without needing to use different hardware to scan the disc. Instead it relies on the CD drive's ability to report on inaccuracies in keeping in sync with the grooves.



I wonder if you could just tear the controller out of a CD/DVD drive and build a new one from scratch, kind of like the new floppy controllers being used now to read the raw magnetic data. You could just command the head to move to the center, find the beginning of the data and just keep reading until you hit the buffers.


Sorta, kinda? It's a bit of a different game.

Floppies (most of them, anyway) have fixed track widths, and these tracks are arranged cylindrically, and these cylinders align with the steps of the stepper motor that is used to actuate the head assembly.

It's relatively easy, with the right ratio betwixt step advancement and track width, to get the head moving properly on a new implementation of a floppy controller. Want to read track 1? Step the head N times to reach track 1 from wherever it started, and read it. Next, want to read track 33? Step the head N times to track 33, and read that.
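In code, the floppy side of this is about as simple as positioning gets. A minimal sketch, assuming track 0 is the outermost track and a fixed number of stepper pulses per track; `steps_to_track` and `steps_per_track` are illustrative names, not any real controller's API.

```python
def steps_to_track(current, target, steps_per_track=1):
    """Return (pulse_count, direction) needed to move the head between tracks.

    Assumes track 0 is outermost, so a higher track number means stepping
    inward towards the spindle.
    """
    delta = (target - current) * steps_per_track
    if delta == 0:
        return 0, None
    return abs(delta), "in" if delta > 0 else "out"


if __name__ == "__main__":
    print(steps_to_track(0, 1))                       # one pulse inward
    print(steps_to_track(1, 33))                      # 32 pulses inward
    print(steps_to_track(33, 0, steps_per_track=2))   # double-stepping back out
```

There is no feedback in this at all: the controller counts pulses and trusts the mechanics.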

But tracking the spiral groove of a CD is a very different problem to solve. Steps tend to lose their meaning. Instead of electromagnetic steps, it involves 3 different laser beams: Two to continuously keep the head centered where it needs to be on the ever-changing groove using a servo feedback loop, and a third to read the data from the pits and lands from the middle of that groove.
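To show why steps lose their meaning, here is a toy simulation of a closed tracking loop. It is a deliberately simplified sketch: a plain proportional controller, arbitrary units, and a track centre that drifts outward by a constant amount per tick. The error term stands in for the difference between the two side-beam signals. None of the names or numbers come from a real player.

```python
def track_spiral(steps, drift_per_step, gain, start_error=0.0):
    """Simulate a proportional servo following a steadily drifting track centre."""
    track = start_error   # true centre of the track (arbitrary units)
    head = 0.0            # where the read spot currently sits
    errors = []
    for _ in range(steps):
        error = track - head       # stands in for the side-beam difference signal
        errors.append(error)
        head += gain * error       # servo nudges the head towards the centre
        track += drift_per_step    # the spiral keeps moving outward
    return errors


if __name__ == "__main__":
    errors = track_spiral(steps=200, drift_per_step=0.01, gain=0.5)
    print(f"first error: {errors[0]:.4f}")
    print(f"final error: {errors[-1]:.4f}")
```

With a purely proportional loop the head never quite catches up; it settles at a constant lag of drift/gain behind the track centre. Real players presumably use something more elaborate, but the point is that the head position is driven by a continuous error signal, not by counting steps.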

Is it do-able? Sure! People with far less advanced tech than we on HN might have lying around did it 40+ years ago.

It's just a very different nut to crack than reading a floppy is, even if the mechanical and optical bits are recycled.

(And that's just head positioning. The pits and lands still need to be read, and those reflect back from the disc as optical phase shifts, not as changes in magnetic polarity and/or amplitude.)


Why? You can extract raw data and raw subchannel data directly from a CD/DVD drive. This isn't the case with how floppy drives work.


The "why" was covered in a parent comment: https://news.ycombinator.com/item?id=40923030


I can read, thanks. There is no benefit to it. If the desire were to look at them out of curiosity, a microscope would do.



