software *is* data

Story: A grizzle about captive data
Total Replies: 7
gus3

Jul 22, 2020
6:16 PM EDT
Repeat after me: "Code is data. Data is code." If you don't believe me, take a look at Fabrice Bellard's QEMU. It's now possible to emulate a Lisp machine, inside an ARM emulation running FreeBSD, inside a Motorola 68K emulation running NetBSD, itself running on x86-64 Linux.

CPU instructions are numbers. The email to Grandma is numbers. Inputs are numbers. Outputs are numbers. The thing you're 3-D printing might not be a number, but the computer is sending out numbers to control the 3-D printer.

It's numbers, all the way down.
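
A minimal Python sketch of that claim, under a couple of assumptions: the names `message` and `add` are invented for the example, and CPython bytecode stands in for real CPU instructions. Both the text and the function body come out as plain lists of integers between 0 and 255.

```python
# Text and executable code both reduce to sequences of small integers.
message = "Hi Grandma"                      # hypothetical email text
text_numbers = list(message.encode("utf-8"))

def add(a, b):                              # hypothetical function
    return a + b

# co_code holds the function's raw CPython bytecode as a bytes object.
code_numbers = list(add.__code__.co_code)

print(text_numbers)   # byte values of the UTF-8 encoded text
print(code_numbers)   # exact values vary between Python versions
```

Nothing in either list says which one is "the program"; that only comes from what reads it.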

Further reading: Alan Turing's "On Computable Numbers" and Gödel's incompleteness theorems, whose proof relies on Gödel numbering.

Also suggested: Numberphile's videos about the incompleteness theorem. https://www.youtube.com/watch?v=O4ndIDcDSGc and https://www.youtube.com/watch?v=mccoBBf0VDM
Bob_Mesibov

Jul 23, 2020
5:54 AM EDT
Sorry, gus3, you're missing the point. You might as well say that poetry and news reports on computers are the same because both consist of alphanumerics, punctuation, spaces and control characters, all the way down. To say that Microsoft Excel code is data and the information contained in it is data is absurd reductionism.

"Software" (Wiktionary) is "Encoded computer instructions". In the cases I was discussing, those instructions tell the program how to operate on data strings. Different programs have different instructions for dealing with the same data strings. The instructions for programs that contain data are fundamentally different from the instructions that operate on a stream of data.
jdixon

Jul 23, 2020
11:44 AM EDT
> You might as well say that poetry and news reports on computers are the same because both consist of alphanumerics, punctuation, spaces and control characters, all the way down.

Are they in the same language or not? If they are, gus3 is correct. Computers speak the language of binary numbers.

> The instructions for programs that contain data are fundamentally different from the instructions that operate on a stream of data.

If they're so fundamentally different, try taking a program compiled in an instruction set you don't know and using data in a language you don't know and determining which is which.
gus3

Jul 23, 2020
4:58 PM EDT
When you write a program, then execute it, you're creating a dedicated-purpose virtual machine. And if you give that machine the same inputs, you'll get the same output. In that sense, the input (data) causes the virtual machine to behave identically, every time. Hence, the input (data) acts as instructions to the virtual machine.

So which one was actually causing that behavior? The truth is, both were, equally.
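
One way to picture this is a toy interpreter, sketched here in Python. The opcode encoding (0 = push the next number, 1 = add, 2 = multiply) and the names `run` and `data` are made up for illustration. The program `run` never changes, yet what it does is decided entirely by the list of numbers it is fed.

```python
def run(tape):
    """Interpret a list of numbers as a tiny stack program.

    Hypothetical encoding: 0 n -> push n, 1 -> add, 2 -> multiply.
    """
    stack = []
    i = 0
    while i < len(tape):
        op = tape[i]
        if op == 0:                       # push the following number
            stack.append(tape[i + 1])
            i += 2
        elif op == 1:                     # add the top two values
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            i += 1
        elif op == 2:                     # multiply the top two values
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
            i += 1
        else:
            raise ValueError(f"unknown opcode {op}")
    return stack

# push 2, push 3, add, push 4, multiply
data = [0, 2, 0, 3, 1, 0, 4, 2]
print(run(data))   # same input, same result, every time
```

Feed `run` a different list and it behaves like a different machine, even though its own code is untouched.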
Bob_Mesibov

Jul 23, 2020
5:10 PM EDT
@gus3, you did read the blog post before commenting, didn't you? If you did, it would have been clear that the "data" in "software is not data" is "data" as understood by an astronomer, a statistician or a Big Data analyst, not "data" as seen by a CPU.

You're welcome to believe that people shouldn't use the word "data" in any sense other than the one you prefer, but that doesn't make for helpful comments.
gus3

Jul 23, 2020
6:53 PM EDT
Yes, I did read the whole thing before posting. And I certainly agree with your "grizzle" about file formats. The point of the posting is spot-on. But your assertion that "software is not data" is not.

Okay, taking a different tack: If software is not data, how can a disassembler do its job?
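
As a small illustration, Python's standard `dis` module is a disassembler for Python's own bytecode. This is only a sketch, with a made-up function `greet`, and the opcode names it prints differ between Python versions. The disassembler reads the compiled function as input data and reports what it finds.

```python
import dis

def greet(name):                  # hypothetical function to inspect
    return "Hello, " + name

# The compiled body is an ordinary bytes object that any program can read.
raw = greet.__code__.co_code

# dis treats that compiled code as its input data.
opnames = []
for instr in dis.get_instructions(greet):
    opnames.append(instr.opname)
    print(instr.offset, instr.opname, instr.argrepr)
```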
Bob_Mesibov

Jul 23, 2020
7:34 PM EDT
@gus3. Please, let's make sure first that we're talking about the same things. In the blog post and I hope in this discussion, "software" is different from "data". "Software" is what you use to analyse, archive, manage, process etc etc "data". This is higher-level talk, like thinking about poetry as poetry instead of as simple text strings.

You and I know that an Excel file is just data with proprietary format bits added on. You and I know that you can do things with the data if you feed the file to software also called Excel, and that the application and the file are different bits of code.

This is *not* what the vast majority of computer users think, like the dingbat who buried a plain text table in Microsoft Access. The Excel file sits as an icon on their desktop, they double-click the icon and magically the data appear, revealed by and embedded in Excel. It's hardly surprising that the user confuses the data with the software environment in which they see and manipulate it. This is what I mean when I write "many people don't understand that software is not data".

It can be worse, as I'm sure you know. Aunt Mildred complains that her computer says something-or-other. You foolishly ask "What program gave you that message?" and she says "I don't know! It's on the computer!".

Technical competence has lots of degrees and branches. In my data work I mainly deal with technically competent people, but if I recommend to a client that they convert their file encoding from Windows-1252 to UTF-8 and delete the carriage returns at the end of each line, I may get a "I'll get someone to help me with that" response.
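
For what it's worth, that conversion is only a few lines. Here is a rough Python sketch; the function name `to_utf8_lf` and the sample text are invented for the example, and it assumes the input really is Windows-1252 with CRLF line endings. Python's `cp1252` codec raises an error on the handful of byte values that encoding leaves undefined.

```python
def to_utf8_lf(raw: bytes) -> bytes:
    """Re-encode Windows-1252 bytes as UTF-8 with LF-only line endings."""
    text = raw.decode("cp1252")           # interpret bytes as Windows-1252
    text = text.replace("\r\n", "\n")     # drop the CR before each LF
    return text.encode("utf-8")

# Hypothetical sample: two lines containing non-ASCII characters.
sample = "café\r\nnaïve\r\n".encode("cp1252")
converted = to_utf8_lf(sample)
print(converted)
```

For a real file, the same function could be applied to bytes read with `open(path, "rb")`, with the result written back in binary mode.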

In any case, the BASHing data blog is aimed at data workers, for whom "software" is different from "data".
gus3

Jul 23, 2020
9:44 PM EDT
Well, taking your own words: Poetry is simple text strings, within the context of emotions. I'd say that's even true of William Blake's "The Tyger".

And that's the central point: context matters. When a tree falls in the forest, and there's no creature around to hear it, does it make a sound? No, it just makes compressions and rarefactions in the air. "Sound" is how our ears and brains interpret the effects of the tree falling. There's no shortage of videos on YouTube of trees falling. The sound is just a sub-stream which gets converted into voltages on your PC's speakers.

But what if a tree fell into a river, not twenty meters in front of you as you were tubing? That's a very different context, and suddenly the sound turns into an immediate instruction: "watch out!"

> if I recommend to a client that they convert their file encoding from Windows-1252 to UTF-8 and delete the carriage returns at the end of each line, I may get a "I'll get someone to help me with that" response.

If that's what it comes to, I'd suggest offering to do it yourself, if the client trusts you.
