The Linus and Dirk show

By Jake Edge
May 30, 2013
LinuxCon Japan 2013

Linus Torvalds and Dirk Hohndel sat down at LinuxCon Japan 2013 for a "fireside chat" (sans fire), ostensibly to discuss where Linux is going. While they touched on that subject, the conversation was wide-ranging over both Linux and non-Linux topics, from privacy to diversity and from educational systems to how operating systems will look in 20-30 years. Some rather interesting questions—seemingly different from those that might be asked at a US or European conference—were asked along the way.

[Linus Torvalds and Dirk Hohndel]

Hohndel is the CTO of the Intel Open Source Technology Center, and Torvalds "needs no introduction", as Linux Foundation executive director Jim Zemlin said at the outset. Given Zemlin's comment, though, Hohndel asked how Torvalds introduces himself: does he mention that he is a "benevolent dictator", for example? But Torvalds said that in normal life he doesn't try to pull Linux into things; he just introduces himself as "Linus" (interestingly, he pronounced it as lie-nus) and leaves it at that. He has, he says, a regular life and doesn't get recognized in the streets, something he seemed quite happy about.

Releases and merging

The 3.10-rc3 release was made just before they left Portland for Tokyo, so Hohndel asked about where things are heading and what can be expected in coming releases. Torvalds said that the kernel release cycle has been "very stable" over the last few years and that there are never any plans for which new features will appear or when they will be released. Instead, he releases what is ready at the time the cycle starts. People know that if they miss a particular release with their new feature, ten weeks down the line there will be another merge window for it to get added.

Most of the changes that go in these days are for new hardware, as the core has stabilized, Torvalds said. The changes for new hardware come both in drivers and in support for new CPUs, particularly in the ARM world. Over the last few years, there has been a lot of work to clean up the ARM tree, which was a mess but has gotten "much better". These days, ARM and x86 are the two architectures that get the most attention, but Linux supports twenty or so architectures.

Noting that Torvalds had seemed a little more unhappy than usual recently, Hohndel asked if it was caused by the number of patches he was merging. Hohndel said that the 3.10 merge window was the largest ever, and that the -rc3 announcement showed some displeasure with how things were going. Torvalds said that the size of the merge window was not really a problem, unless the code merged is "untested and flaky". It is a problem when there are a lot of fixes to the newly merged features that come in during the -rc releases. "I want code to be ready" when its merge is requested. Given the ten week release schedule, there are only six or seven weeks to get everything to work, so he is unhappy when people ask him to merge code that makes it harder to converge on the final release. When that happens, it results in "a lot of cursing", he said.

If he gets "too annoyed" at some subsystem or area of the kernel, Torvalds sometimes resorts to refusing to pull code from the developer(s) in question. It is "the only thing I can do", when things get to that point. It is an indication that "you need to clean up your process because I don't want the pain you are causing". Normally that happens in private with a rejection of a pull request in a "try again next time" message, but sometimes he does it publicly. His job is to integrate changes, so he wants to say "yes", which makes it painful for both sides when he gets too frustrated and has to say "no".

Diversity

Hohndel noted that Kernel Summit pictures tend to contain only white males, but he thinks we are making some progress on making the kernel community more representative of the world we live in; "is it improving?", he asked. Torvalds said that he thinks it is improving, but that the Kernel Summit is the "worst possible example" because it mostly represents those who have been involved for 10-15 years. In the early days, Linux was mostly developed in western Europe and the US, which makes the diversity at the summit rather low.

Beyond geographic diversity, the number of women in the open source software world is low, though Torvalds is not clear on why that is. It is getting better through efforts by companies like Intel and organizations like the Linux Foundation to help women get more involved so that the community won't be so "one-sided". He noted that there were few Japanese kernel developers when the first Japan Linux Symposiums started, but that has now changed. Japan (and Asia in general) are much better represented these days.

[Linus Torvalds and Dirk Hohndel]

The first-time contributors to the kernel are more diverse than they were a few years ago, Hohndel said, which is a step in the right direction. There is a problem using that as a measure, though, Torvalds said, because it is fairly easy to do one patch or a few trivial patches. Going from one patch to ten patches is a big jump. There are a lot of people who do something small, once, but taking the next step is hard. Something like half of all the kernel contributors have only ever contributed one patch. That is probably healthy overall, but looking at first-time contributors may not be an indicator of the makeup of the actual development community.

An audience member pointed out that in addition to the low numbers of women at the conference, there was also a lack of college and high school students. Torvalds said that he didn't find that too surprising, as even those using or developing open source at school probably wouldn't attend a conference like LinuxCon. There is definitely a need for more women participating, though, so hopefully the outreach programs will help there. Hohndel mentioned the Outreach Program for Women, which has kernel internships funded by the Linux Foundation. Sarah Sharp of Intel has been overseeing the program, which has been "extremely successful" in getting applicants. Torvalds said that it had brought in over a hundred patches to the kernel from the applicants.

Education

Another audience member mentioned a university in Switzerland that uses open source software as part of its curriculum, guiding the students to the culture of open source, IRC channels, and the like. Torvalds said that there are other universities with similar programs, which is good. He pointed out that it is "not just about the kernel", which is a hard project to enter, but that other open source projects are good stepping stones to kernel development. Oftentimes, a project needs help from the kernel, so that's how its participants start to get involved with the kernel.

In answer to a question about the differences in educational systems and whether there are specific advantages in learning technology, Torvalds noted that he had first-hand experience with Finland's system and second-hand with the US system through his kids. Finland makes an excellent example, he said, because there is a lot of technology that comes out of a fairly small country "in the middle of nowhere". It has a population of five million and there are cities in Japan that are bigger. In Finland, education is free, so that students don't have to worry about how to pay for it. That means that they can "play around" some rather than just focusing on school work.

Torvalds spent eight and a half years at his university in Finland, and only came away with a Masters degree. In some sense, that's not very efficient, he said, but he worked on Linux during that time. Finland's system gives people the freedom to take risks while they are attending school, which is important. Some will take that freedom to just drink and party, but some will do something risky and worthwhile. In the US, he can't imagine someone going to a good university for that long, because they can't afford to do so. Finland is not necessarily so strict about getting people to graduate in four years, which gives it a "huge advantage" because of that openness and freedom.

Contributing

A non-developer in the audience asked about how he could make a contribution to the kernel. Torvalds was emphatic that you don't need to be a developer to contribute. Part of the beauty of open source is that people can do what it is they are interested in. When he started working on Linux, he had no interest in doing the things needed to turn it into a product. There is documentation, marketing, support, and so on that need to be done to make a product, but he only wanted to write code. He left it to others to do the work needed for making it a product.

Depending on one's interests there are countless things that can be done to contribute. Translating documentation or modifying a distribution so that it works better for Japanese people are two possibilities that he mentioned. Beyond that, Hohndel said testing is a huge part of the process. Running kernels during the development cycle and reporting bugs is a crucial part of making Linux better. But it is not just about the kernel, he said. When Torvalds is on stage the conversation naturally drifts in that direction, but Hohndel said that there are tens of thousands of open source projects out there. People can document, translate, and test those programs. There are a "ton of opportunities" to contribute.

For example, Torvalds and Hohndel work on Subsurface, which is a graphical dive log tool. Torvalds said that it involves lots of graphical user interface (GUI) work that he has "no interest in at all". A GUI designer, even one who can't write code, would be welcome. Creating mockups of the interface that someone else could write the code for would be very useful. Of course, "real men write kernel code", he said. Hohndel chided him by noting that statements like that might be part of the reason why there aren't more women involved. A somewhat chagrined Torvalds agreed: "real men and women write kernel code".

The future

Another question concerned non-volatile main memory and how Torvalds thought that would change computers. First off, there has been talk about non-volatile main memory for a long time and it's always just a few years off, Torvalds said, so he is skeptical that we will see it any time soon. But if it is finally truly coming, he thinks its biggest impact will be on filesystems. For many years, we have been making filesystems based on the block layout of disks, but non-volatile memory would mean that we get byte addressability for storage. That would allow us to get away from block-based organization for filesystems.
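The difference between block-based and byte-addressable access can be sketched in a few lines. This is only an analogy under stated assumptions: an ordinary temporary file stands in for storage, `mmap` stands in for byte-addressable memory, and the 4096-byte `BLOCK_SIZE` and both function names are hypothetical choices made for the illustration, not anything described in the talk.

```python
import mmap
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def patch_via_blocks(path, offset, value):
    """Block-style update: read a whole block, change one byte, write it back."""
    start = (offset // BLOCK_SIZE) * BLOCK_SIZE
    with open(path, "r+b") as f:
        f.seek(start)
        block = bytearray(f.read(BLOCK_SIZE))
        block[offset - start] = value
        f.seek(start)
        f.write(block)
    return len(block)  # bytes moved in each direction


def patch_via_mapping(path, offset, value):
    """Byte-style update: map the file and store a single byte in place."""
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as m:
        m[offset] = value
        m.flush()
    return 1  # bytes the program had to touch


# Create an 8 KiB scratch file and patch two neighbouring bytes both ways.
fd, scratch = tempfile.mkstemp()
os.write(fd, bytes(2 * BLOCK_SIZE))
os.close(fd)

moved_block = patch_via_blocks(scratch, 5000, 7)
moved_byte = patch_via_mapping(scratch, 5001, 9)

with open(scratch, "rb") as f:
    contents = f.read()
os.remove(scratch)
```

Both calls change a single byte, but the block version has to move a full 4096-byte block to do it. A filesystem designed around byte addressability would not need to organize its data around that unit.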

For main memory, even if it is all non-volatile, he thinks working memory will still be treated as RAM is today. Non-volatile memory will make things like suspending much easier, but there won't be a lot of externally visible changes. There will still be a notion of long-term storage in files and databases. But the internals of the kernel will change. It will take a long time before we see any changes, even if the hardware is available soon, Torvalds said. It takes a good bit of time before any new technology becomes widespread and common. There is "huge inertia" with existing hardware; people tend to think technology moves faster than it actually does.

Another audience question asked what he thought operating systems would look like in 20-30 years. "Not much different" was Torvalds's answer. Today's operating systems are fairly similar to those of 40 years ago, at least conceptually. They are much bigger and more complicated, but the basic concepts are the same. If you looked at an operating systems book from the 1970s, today's operating systems would be recognizable in it. The outward appearance has changed; GUIs are much different from the interfaces on those older systems, but much of the rest is the same. "Files are still files", we can just store more of them. There won't be that much change because "most of what we do, we do for good reasons", so throwing that away makes little sense.

Another recent trend is wearable computing, as typified by Google Glass, Hohndel said. What did Torvalds think of that? The idea of a small screen that is always there is attractive to him, Torvalds said, but the problem is not on the output side, it is the input part that is difficult. He hates writing email on his cell phone, so he would love to see some kind of "Google Glassy thing" for input. He hates voice recognition; perhaps it would work for writing a letter or something, but you can't edit source code that way. "Up up up up up" just doesn't work. Maybe someone in the audience will figure out a better way, he said.

The privacy implications of Google Glass don't really bother Torvalds. He said that others (like Hohndel) are much more protective of their privacy than he is. "My life is not that interesting", Torvalds said, and doesn't need to be that private. "All of the interesting things I do, I want to put out there", so there are just a few things (like his bank password) that he cares about protecting. While people are unhappy with Google Glass because it can record anything the person sees, it is something people could already do without Glass, so it's "not a big issue" to him.

Young and stupid

"I was young, I was stupid, and I did not know how much work it would be". That's how Torvalds started his answer to a question about his inspiration to write Linux. He wanted an operating system, and was even willing to buy one, but there was nothing affordable available that he wanted to run. Some are religious about open source, but he is not, and would have bought something if it was available. So he started writing Linux and made it open source because he wanted to "show off" and didn't want to sell it.

Torvalds knew that he was a good programmer, but he also knew he was not a good businessman. If Linux hadn't attracted other people right away, the project probably would have died within six months. But the involvement of others was a "huge motivating factor" for him to continue, because it "made it much more fun". Some people ask him if he regrets making Linux open source, because he could be as rich as Bill Gates today if he hadn't. It's a kind of nonsensical question, because Linux wouldn't be where it is today if he hadn't opened it up, so he wouldn't have reached Gates-level wealth even if he had kept it closed. But there is "no question in my mind" that making Linux open source was the right thing to do.

He has never had a plan for Linux; it already did what he wanted it to do in 1991. But many others in the Linux community did have plans for Linux and those plans were quite different. Some were interested in small devices and cell phones, others in putting it into gas pumps, some wanted to use it in space, still others for rendering computer graphics. All of those different ideas helped shape Linux into what it is today. It is a better way to develop a stable operating system, Torvalds said. When a single company has a plan for its operating system, which changes with some regularity, it destabilizes the system. Linux, on the other hand, has many companies that know where they want it to go, which has a tendency to keep Linux stable.

[I would like to thank the Linux Foundation for travel assistance to attend the Automotive Linux Summit Spring and LinuxCon Japan.]


Index entries for this article
Conference: LinuxCon Japan/2013


LCJ: The Linus and Dirk show

Posted May 31, 2013 2:17 UTC (Fri) by xanni (subscriber, #361) [Link]

> "Files are still files", we can just store more of them. There won't be that much change because "most of what we do, we do for good reasons"

I'm not at all convinced. I believe that a lot of the "file" model is historic accident, tradition and inertia. Everyone has learned and promulgates that particular mental model, but it's by no means the only possible or necessarily the most useful way to think about documents.

LCJ: The Linus and Dirk show

Posted May 31, 2013 3:35 UTC (Fri) by dlang (guest, #313) [Link]

It doesn't matter if you call them 'documents' or 'files', you are still going to be organizing your data into something that can be called by either name.

LCJ: The Linus and Dirk show

Posted May 31, 2013 9:01 UTC (Fri) by Lennie (subscriber, #49641) [Link]

I can at least see some changes in how we use things these days.

A web URL used to be files too. They had .html as an extension. HTTP still delivers blobs with a mime-type. But the blobs delivered don't need to have a length anymore. The namespace is still similar.

Maybe a URL starts to look more and more like a stream, as parts of the page are updated on the fly.

You'll always need a namespace, in the case of the web the URL is the namespace. For filesystems directories and files are the namespace.

NTFS also has streams (no idea how it works, what it does or if they are useful, I don't think it gets a lot of use).

If I'm not mistaken Mac OS X and BeOS record a mime-type with the file.

Are these trends ? I don't really know.

Anything new would probably need to have snapshot support, that might be the only way you could work with streams.

I do see a trend where people talk more and more about erasure encoding.

If you look at Lustre and Ceph for example, they talk about objects not files.

With the use of VMs you see a lot of use of blockdevices these days.

Both Ceph objects and those VMs are still stored in files currently.

The question is: is this just for historic reasons? Would a clean slate end up at the same solution?

I think the last time this topic came up on LWN it was about the API.

If there is no new API, I doubt much will change.

LCJ: The Linus and Dirk show

Posted May 31, 2013 9:44 UTC (Fri) by khim (subscriber, #9252) [Link]

> NTFS also has streams (no idea how it works, what it does or if they are useful, I don't think it gets a lot of use).

They are used to keep some metainformation around in MS Office (and of course malware uses them a lot) but initially they were developed by architecture astronauts to extend the MacOS model of two "forks". Model which MacOS itself abandoned nowadays.

> If I'm not mistaken Mac OS X and BeOS record a mime-type with the file.

Not Mac OS X, no. Mac OS Classic did that. They even had a central repo of all "mime-types". Turned out to be a bad idea.

> Are these trends? I don't really know.

See above. The trend goes in the other direction, actually. Once upon a time our handhelds had no filesystem "because it does not work for PDA" (I mean Palm OS here). Today we use Android, iOS and some other devices - and they all support files at some level.

Files are becoming scarce in the interface, that's true, but all attempts to remove them from the plumbing layer ended up as failures.

> The question is: is this just for historic reasons? Would a clean slate end up at the same solution?

Nobody knows. So far all attempts have ended in the same fashion: when people are faced with the need to reuse old software, they either resurrect files on the new platform or the platform itself is killed and replaced with another where files are supported. We cannot do a clean-slate restart!

> If there is no new API, I doubt much will change.

And if there is no old API it'll not change anything at all. Either people will eventually emulate the old API on top of the new one (and the next logical step is to promote the emulation layer to the status of first-class citizen - this process is happening right now, today, with all these new-fangled HTML-based platforms) or they will abandon the platform.

LCJ: The Linus and Dirk show

Posted Jun 5, 2013 3:59 UTC (Wed) by lambda (subscriber, #40735) [Link]

> Model which MacOS itself abandoned nowadays.

Ha! I wish. Storing (and managing) resource forks on Linux filesystems which don't support them has been the bane of my existence over the past year. It doesn't help that every single implementation of a layer that translates from Mac filesystem semantics uses a different convention for storing them (resource.frk/filename, ._filename, .AppleDouble/filename).

> Not Mac OS X, no. Mac OS Classic did that. They even had a central repo of all "mime-types". Turned out to be a bad idea.

Bad idea? It actually worked out quite well; the only problem was, other systems didn't use them, they either used file extensions, content sniffing, or other metadata, so again, like the resource forks I mention above, they became too much of a pain to deal with once networking with other systems became widespread.

But these aren't problems of the feature itself; just the lack of standardization on other systems.

LCJ: The Linus and Dirk show

Posted May 31, 2013 19:59 UTC (Fri) by jengelh (subscriber, #33263) [Link]

>A web URL used to be files too.

And now? Could be some CGI that generates data. But then, so can filesystems (and FUSE has spawned quite a lot of them).

>If you look at Lustre and Ceph for example, they talk about objects not files.

And then there is gluster, which does not do objects, but files. The upshot? You can just get to your data without having to collect the puzzle pieces that are objects.

LCJ: The Linus and Dirk show

Posted Jun 3, 2013 4:24 UTC (Mon) by JoeF (guest, #4486) [Link]

> A web URL used to be files too.

No, it never was. A URI is a resource identifier. The resource can be anything. It _can_ be a file, but *never* had to be. A resource with a mimetype text/html never had to have a .html extension, either. File extensions are meaningless in HTTP. The mimetype matters.

> NTFS also has streams (no idea how it works, what it does or if they are useful, I don't think it gets a lot of use).

That's known as ADS, Alternate Data Streams. Accessed with normal file operations, with the name as filename:streamname.
It's not used much, but I have seen it occasionally. It doesn't survive copying to FAT filesystems, e.g., on USB sticks, so the usefulness is very limited.

LCJ: The Linus and Dirk show

Posted May 31, 2013 9:52 UTC (Fri) by xanni (subscriber, #361) [Link]

The fact that you see the two as interchangeable demonstrates my point about accepting the dominant paradigm without considering alternatives.

A document in Xanadu terminology is any collection of information intended by the author(s) to be regarded as a single work, and may comprise multiple identifiers, authors, versions, media types and instances; content may participate in any number of documents simultaneously, though not necessarily in the same arrangement. This can be stored as "files", disk blocks, database records etc. but that is an implementation issue and certainly not a one-to-one relationship.

LCJ: The Linus and Dirk show

Posted May 31, 2013 13:38 UTC (Fri) by dlang (guest, #313) [Link]

I see the items that you are describing as being collections of files or documents. the fact that Xanadu opted to use the term 'document' to refer to this collection is just overloading the term.

Having everything be a mish-mash of chunks of data from many sources sounds good in theory, but in practice how does it work when you have a second system involved? especially if you don't guarantee always-on connectivity?

LCJ: The Linus and Dirk show

Posted May 31, 2013 14:23 UTC (Fri) by xanni (subscriber, #361) [Link]

> I see the items that you are describing as being collections of files or documents. the fact that Xanadu opted to use the term 'document' to refer to this collection is just overloading the term.

The point is that it's valuable to re-examine things from a different perspective. When you focus on the low-level storage rather than what's important to the author, you're making the author adjust to the historical baggage of arbitrary design choices. And if you choose to use the word 'document' as a synonym for 'file', you're losing an important distinction.

> Having everything be a mish-mash of chunks of data from many sources sounds good in theory, but in practice how does it work when you have a second system involved? especially if you don't guarantee always-on connectivity?

Local copies.

LCJ: The Linus and Dirk show

Posted May 31, 2013 15:18 UTC (Fri) by dlang (guest, #313) [Link]

I don't care if the data is stored as a "file" on a filesystem, or as an entry in a database.

I'm saying that your "documents" are really collections because they contain many different types of data that get used and created in many different tools, and they get changed independently of each other.

the "file" is not a matter of how it's stored on disk, it's a matter of how it's created, manipulated and used.

I don't care if your "document" has images, audio, video, and text mixed in it. You may view it that way (just as web browsers have presented all these types to users as one "page"), but the different components are separate conceptual entities that are later linked together. These fundamental conceptual entities are what are referred to as "files". It doesn't matter if they are stored on a traditional filesystem, or if they use a SQL database as their "filesystem".

After all, both a filesystem and a database are ways to store data and retrieve it later based on a variety of criteria.

you see me as refusing to look at things in a new way, I see you as being distracted by implementation details that don't change the fundamental behaviour and usage.

LCJ: The Linus and Dirk show

Posted Jun 2, 2013 5:31 UTC (Sun) by xanni (subscriber, #361) [Link]

And you're still missing my point that there can be new types of fundamental behaviour and usage if you change the implementation and the API that it offers. The Xanadu design allows recomposition and reuse of any arbitrary "slice" in however many dimensions the data is represented, so there are no "fundamental conceptual entities referred to as files" - that's the whole point. The underlying data is an ever-growing collection of all kinds of media which only has boundaries and labels applied after the fact. The "fundamental conceptual entity" in Xanadu is a tree of byte streams which doesn't really map to "files" at all; the layer at which we map to "files" is the document, as an API on top of a different internal implementation.

LCJ: The Linus and Dirk show

Posted Jun 6, 2013 6:09 UTC (Thu) by jbv (guest, #66170) [Link]

Isn't Xanadu a lot like LaTeX documents with \include{doc}, \figure{...} etc. ?

LCJ: The Linus and Dirk show

Posted Jun 6, 2013 6:31 UTC (Thu) by xanni (subscriber, #361) [Link]

Not particularly, no. What gave you that impression?

LCJ: The Linus and Dirk show

Posted May 31, 2013 9:47 UTC (Fri) by ibukanov (subscriber, #3942) [Link]

There is an observation that the distribution of idea life-spans follows a power law. If an idea has survived X years then, in the absence of other knowledge, the best guess is that the idea will still be relevant in another X years.

So given that the notion of a file in a filesystem has been used for at least 40 years, the best guess is that in 40 years files will still be relevant. Another example is the tablet computer. Mass production of tablets began about 3 years ago, so at this point we can only expect them to survive 3 more years as a popular device, as some yet unknown tech may replace them in due time.
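The rule of thumb above can be made concrete with a worked example. As assumptions for illustration only (the comment does not name a distribution or say what "best guess" means), suppose an idea's total lifetime T follows a Pareto (power-law) distribution with minimum t_0 and exponent alpha, and take the "best guess" to be the median:

```latex
\[
  P(T > t) = \left(\frac{t_0}{t}\right)^{\alpha}, \qquad t \ge t_0 .
\]
Given survival to age $X \ge t_0$, the conditional survival function is
\[
  P(T > t \mid T > X) = \frac{P(T > t)}{P(T > X)}
                      = \left(\frac{X}{t}\right)^{\alpha}, \qquad t \ge X .
\]
Setting this to $1/2$ gives the median total lifetime
$t_{\mathrm{med}} = 2^{1/\alpha} X$, so the median remaining lifetime is
\[
  t_{\mathrm{med}} - X = \left(2^{1/\alpha} - 1\right) X ,
\]
which equals $X$ exactly when $\alpha = 1$.
```

Under these assumptions the expected remaining life is always proportional to the age already reached, and for alpha = 1 an idea that has lasted 40 years has a median of 40 more to go. Other exponents change the constant but not the proportionality.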

LCJ: The Linus and Dirk show

Posted Jun 3, 2013 5:31 UTC (Mon) by deepfire (guest, #26138) [Link]

Of course, the trick is what you would allow yourself to call "other knowledge".

What if I told you that "people like tablets", or "people are rarely satisfied by The Only Hierarchy imposed by the filesystem, when managing their personal files"?

Would these kinds of "knowledge" count?

LCJ: The Linus and Dirk show

Posted Jun 3, 2013 12:36 UTC (Mon) by ibukanov (subscriber, #3942) [Link]

Typically things that kill ideas are subtle and cannot be reduced to just one simple explanation. It is not that people suddenly have started to like tablets. Rather many people have started to perceive tablets as a better alternative to other computing devices. There are many reasons for that.

Similarly, it may well be that people are not satisfied with the single file hierarchy, it is just that other alternatives are worse at least in perception.

LCJ: The Linus and Dirk show

Posted May 31, 2013 12:09 UTC (Fri) by renox (guest, #23785) [Link]

> I'm not at all convinced. I believe that a lot of the "file" model is historic accident, tradition and inertia.

This part is correct.

> [cut] but it's by no means the only possible or necessarily the most useful way to think about documents.

This part is very wrong: "files" are mostly an API in fact which is used to access many, many things which are not documents.
Have a look at Plan9 sometime, it goes even further with using "files" for everything.

LCJ: The Linus and Dirk show

Posted May 31, 2013 12:18 UTC (Fri) by xanni (subscriber, #361) [Link]

> This part is very wrong: "files" are mostly an API in fact which is used to access many, many things which are not document.
> Have a look at Plan9 sometimes, it goes even further with using "files" for everything.

I'm well aware of that; I think you're agreeing with me from a different direction. Not only are "files" a useful API which can access many things which are not documents, but my point is that documents can be described in many ways which are not easily mapped to the file API.

LCJ: The Linus and Dirk show

Posted Jun 1, 2013 0:53 UTC (Sat) by ssmith32 (subscriber, #72404) [Link]

> that documents can be described in many ways which are not easily mapped to the file API.

Which does not imply that the file API will go away at all.

The file API is a useful set of primitive operations that will likely remain useful for two reasons:

- from the user's (application developer's) side, it provides just enough flexibility to build higher-level sets of primitives (like your document API) on top of.

- from the (kernel) developer's side, it provides a small enough set of primitive operations that can be cleanly reasoned about and therefore solid guarantees can be then given to the user.

Any more complex, and it becomes too specialized: either too hard to develop, or it can be developed, but only by making the guarantees that one particular set of users cares about (like the users creating a document API).

Any simpler, and you can't build much of anything useful out of it.

So, no, I don't see any convincing argument here that the file API will disappear any time soon - at least from the kernel's perspective, which is the perspective of relevance for Linus, and therefore, this article :)

LCJ: The Linus and Dirk show

Posted Jun 2, 2013 5:36 UTC (Sun) by xanni (subscriber, #361) [Link]

No disagreement there. But I still stand by my view that files are not the only or necessarily the best possible concept; they're a local maximum in the design space. So Linus is correct that the reasons are "good" from the perspective of compatibility, but not from the perspective of finding the most powerful representation for information.

LCJ: The Linus and Dirk show

Posted May 31, 2013 12:35 UTC (Fri) by oever (guest, #987) [Link]

A promising idea that is gaining traction but very slowly is to use content addressable storage (CAS) like git and magnet uris. If we want to share a file or link to information we currently mainly use urls that do not guarantee anything about the data that is reachable at that url. Many webpages change constantly. By using CAS, the agreement on what resource is being talked about is not ambiguous.
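To make the idea concrete, here is a minimal sketch of a content-addressable store in which the key is the SHA-1 of the stored bytes. The `ContentStore` class and its methods are hypothetical names invented for this illustration. It is not git's on-disk format, since git hashes a small header together with the content, so these keys will not match git object IDs.

```python
import hashlib
import tempfile
from pathlib import Path


class ContentStore:
    """Toy content-addressable store: the key is the SHA-1 of the bytes."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, data: bytes) -> str:
        key = hashlib.sha1(data).hexdigest()
        path = self.root / key
        if not path.exists():  # identical content is stored only once
            path.write_bytes(data)
        return key

    def get(self, key: str) -> bytes:
        data = (self.root / key).read_bytes()
        # The reader can check that it received what the key promised.
        if hashlib.sha1(data).hexdigest() != key:
            raise ValueError("stored content does not match its key")
        return data


store = ContentStore(tempfile.mkdtemp())
k1 = store.put(b"version 1 of a page\n")
k2 = store.put(b"version 2 of a page\n")
```

Unlike a URL, `k1` can only ever refer to the first version. Changing the page produces a different key rather than silently changing what the old key returns.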

The same is true on the local file system which uses descriptive strings to refer to files. The files that the string refers to may change between consecutive accesses. There is a locking mechanism for individual files, but not for entire directories, short of mounting it read-only.

Semantic data is also slowly gaining traction, e.g. in Wikidata [1]. When attaching information to files, or more precisely, byte arrays, the use of e.g. a sha1 to identify the data makes it possible to combine information without needing to understand the folder structure.

One thing that would really help the use of such technologies on Linux would be the caching of the checksums of files at the filesystem level, so that the sha1 can be invalidated when the file changes and recalculated quickly. To make this more efficient, one could use a checksum computed over ever larger blocks (1k, 8k, 64k, etc.), much as a tree in git is a checksum over the checksums of its contents.
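As a rough illustration of the block-tree idea, here is a minimal Python sketch. The block size, the fan-out, and the function names are assumptions chosen for the example (a fan-out of 8 over 1k blocks gives the 1k, 8k, 64k progression); this is not an existing filesystem or kernel interface. It shows that after one block changes, only the digests on the path from that block to the root need to be recomputed.

```python
import hashlib

BLOCK_SIZE = 1024  # leaf block size in bytes (assumed for this sketch)
FANOUT = 8         # child digests combined per parent: 1k -> 8k -> 64k ...


def leaf_digests(data):
    """Hash every fixed-size block of the data separately."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    if not blocks:
        blocks = [b""]  # an empty file still gets one leaf
    return [hashlib.sha1(block).digest() for block in blocks]


def parent_level(level):
    """Combine each group of FANOUT child digests into one parent digest."""
    return [
        hashlib.sha1(b"".join(level[i:i + FANOUT])).digest()
        for i in range(0, len(level), FANOUT)
    ]


def build_tree(data):
    """Return all levels of the tree: leaves first, single root digest last."""
    levels = [leaf_digests(data)]
    while len(levels[-1]) > 1:
        levels.append(parent_level(levels[-1]))
    return levels


def update_block(levels, data, index):
    """Recompute only the digests between one changed block and the root."""
    block = data[index * BLOCK_SIZE:(index + 1) * BLOCK_SIZE]
    levels[0][index] = hashlib.sha1(block).digest()
    for depth in range(1, len(levels)):
        index //= FANOUT
        children = levels[depth - 1][index * FANOUT:(index + 1) * FANOUT]
        levels[depth][index] = hashlib.sha1(b"".join(children)).digest()
    return levels[-1][0]  # the new root digest
```

For a file of n blocks, an in-place change then costs on the order of log n hash computations instead of rehashing the whole file. Changes that alter the file length would need the tree rebuilt from the affected block onward, which this sketch does not attempt.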

An API that returns the checksum of a file would be really helpful and might fix tools like 'make' so they do not work with hacky concepts like mtime comparison.

[1] http://meta.wikimedia.org/wiki/Wikidata

LCJ: The Linus and Dirk show

Posted May 31, 2013 15:25 UTC (Fri) by dlang (guest, #313) [Link]

> If we want to share a file or link to information we currently mainly use urls that do not guarantee anything about the data that is reachable at that url.

a filename and path doesn't guarantee anything about the data that is reachable at that location either, how does this fundamentally change anything?

you can't ditch files and refer to everything by a sha1 hash because you will have collisions where different content has the same hash value.

Remember that a "strong hash" doesn't mean there are no collisions, it means that you cannot deliberately create two documents that collide.

In addition, referencing things by the exact content means that if you make any change, you have to then update every place that references that content. While there are some times and places where this makes sense, most of the time it doesn't.

plus there's the minor issue that humans need to access the data, and humans are not going to remember sha1 hashes, but they can remember filenames and paths.

LCJ: The Linus and Dirk show

Posted May 31, 2013 20:23 UTC (Fri) by ibukanov (subscriber, #3942) [Link]

> you can't ditch files and refer to everything by a sha1 hash because you will have collisions where different content has the same hash value.

For practical purposes sha-1 is collision-free. Surely the probability of a collision is not zero, but the probability of a violation of the second law of thermodynamics is also non-zero, yet strangely no violations have been observed.

LCJ: The Linus and Dirk show

Posted May 31, 2013 21:30 UTC (Fri) by apoelstra (subscriber, #75205) [Link]

> For practical purposes sha-1 is collision-free. Surely the probability of a collision is not zero, but the probability of a violation of the second law of thermodynamics is also non-zero, yet strangely no violations have been observed.

The birthday paradox says that if you have N possible values, a collision is practically guaranteed to happen around sqrt(N) samples. (The probability jumps from 0.1% to 99.9% over a(n IMHO) surprisingly small range around sqrt(N).)

So if you've got 2^128 or 10^40 objects around, you'll start to see collisions. This is an enormous number, but we will easily see that many things being thrown around the Internet this century.

To contrast, to reverse the laws of thermodynamics in a noticeable way typically requires on the order of 2^(10^23) attempts. Not ever going to happen in this universe.

LCJ: The Linus and Dirk show

Posted May 31, 2013 23:16 UTC (Fri) by dskoll (subscriber, #1630) [Link]

> So if you've got 2^128 or 10^40 objects around, you'll start to see collisions

SHA1 is only 160 bits so the birthday paradox kicks in at 2^80 or 1.2 * 10^24.
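The arithmetic can be checked with a few lines of Python. This is only a sketch: it uses the standard approximation P ≈ 1 - exp(-n²/2N) for n uniformly random values drawn from N possibilities, and the function name is made up for the example. It says nothing about deliberate attacks, only about accidental collisions.

```python
import math


def collision_probability(n_objects, hash_bits):
    """Approximate chance of at least one accidental collision among
    n_objects uniformly random hashes of hash_bits bits."""
    space = 2.0 ** hash_bits
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x
    return -math.expm1(-(n_objects ** 2) / (2.0 * space))


# Birthday bound for a 160-bit hash: the square root of 2^160.
sha1_bound = 2 ** (160 // 2)
print(f"{sha1_bound:.3e}")  # prints 1.209e+24

# Probability at exactly 2^80 objects, and for a trillion objects.
p_at_bound = collision_probability(2 ** 80, 160)
p_trillion = collision_probability(10 ** 12, 160)
```

Under this approximation the probability at exactly 2^80 objects is about 39%, and for a trillion objects it is below 10^-24.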

LCJ: The Linus and Dirk show

Posted May 31, 2013 23:41 UTC (Fri) by dlang (guest, #313) [Link]

There are researchers who are gathering large groups of files who can point you at two files that generate the same SHA1 today.

This isn't news because it's expected. The strength is that you can't create two documents with the same hash like you can with MD5, but even MD5 is only considered 'partially' broken, because you cannot create a new document that matches the hash of an existing one.

Also, remember that the birthday paradox is the level at which there is a 50% chance of there being a collision.

If you are relying on SHA1 for your entire identification, there will be serious problems for people long before you hit that 50% probability level.

LCJ: The Linus and Dirk show

Posted Jun 3, 2013 18:17 UTC (Mon) by jimparis (guest, #38647) [Link]

> There are researchers who are gathering large groups of files who can point you at two files that generate the same SHA1 today.

Really? I thought the best possible attacks on a non-reduced SHA1 were still around 2^61 operations today and that no collisions have yet been found. Known MD5 collisions have been around since 2005 by comparison.

LCJ: The Linus and Dirk show

Posted Jun 3, 2013 18:28 UTC (Mon) by dlang (guest, #313) [Link]

I didn't say that they had created collisions, I said that they have discovered collisions.

There's a huge difference between the two.

Discovering collisions is expected. A hash is partially broken if someone can manipulate two files to have them generate the same hash, and completely broken if someone can create a new file that matches the hash of a specific target file.

I don't have a source I can point you to, this was something I heard in a security class I was taking from the instructor.

LCJ: The Linus and Dirk show

Posted Jun 3, 2013 18:36 UTC (Mon) by jimparis (guest, #38647) [Link]

> I didn't say that they had created collisions, I said that they have discovered collisions.

I still don't think that is true. Regardless of whether they were generated or merely discovered, I do not believe that there have ever been two different strings of data that have been identified as hashing to the same SHA-1 value. No known collisions exist.

LCJ: The Linus and Dirk show

Posted Jun 6, 2013 15:11 UTC (Thu) by nye (guest, #51576) [Link]

>I do not believe that there have ever been two different strings of data that have been identified as hashing to the same SHA-1 value. No known collisions exist.

I think there have been collisions demonstrated in SHA-1 with a drastically reduced number of rounds, which is not the same thing at all but is probably what dlang's instructor was talking about.

LCJ: The Linus and Dirk show

Posted Jun 1, 2013 8:24 UTC (Sat) by ibukanov (subscriber, #3942) [Link]

The birthday paradox is only practically relevant if one is going to suffer consequences. With 160 bits in SHA-1 that means that one may miss one file per each 10^24 files. Even if in some distant future one could have capacity for that (bandwidth projections imply that this will not happen in this century), I cannot see why such a miss could be more important than a probability of 10^-24 of missing a file in a git tree due to a collision now. So it is not the number of bits but rather the cryptographic weakness that will eventually kill sha-1, when an adversary can create a file with the same hash.

LCJ: The Linus and Dirk show

Posted Jun 1, 2013 16:08 UTC (Sat) by dlang (guest, #313) [Link]

> I cannot see why such a miss could be more important than a probability of 10^-24 of missing a file in a git tree due to a collision now

One big difference is that you don't put all your files into a single git tree, so with git the address is <repository> + <hash>

The point isn't the eventual breakage of SHA1 but rather that if you were to try to address all content in existence by a hash instead of by URL/file/path type addressing, you ARE going to have collisions.

LCJ: The Linus and Dirk show

Posted Jun 1, 2013 17:08 UTC (Sat) by ibukanov (subscriber, #3942) [Link]

If a probability of a collision is, say, 15 orders less than a possibility of hardware failure, then it has no practical consequences.

LCJ: The Linus and Dirk show

Posted Jun 1, 2013 17:20 UTC (Sat) by dlang (guest, #313) [Link]

> If a probability of a collision is, say, 15 orders less than a possibility of hardware failure, then it has no practical consequences.

Only if you discount the possibility that the content may exist elsewhere (on a backup or mirror for example)

which brings up an interesting point, if you are addressing everything by its hash, how do you track that you have more than one copy of it?

git doesn't have this problem because it only does hash based access within the one repository, so it's perfectly able to have additional copies in other repositories.

But if you are making your hash based access global, you no longer have that capability.

LCJ: The Linus and Dirk show

Posted Jun 1, 2013 8:55 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link]

So we can just switch to a 512-bit hash. That'd require on the order of 2^256 tries, which is not really possible without expending the energy output of a typical supernova.

LCJ: The Linus and Dirk show

Posted Jun 1, 2013 14:55 UTC (Sat) by dskoll (subscriber, #1630) [Link]

Content-addressable storage is terrific for computers; I love git and the way it works. But it sucks for humans who instinctively give things meaningful names.

When I make a release of a software package, I tag it in git. That is, I give the release a name.

Users of my software understand perfectly what "Release 9.0.7" means. I'm not so sure they'd be happy if we said they should upgrade from 8480bfdc2481f7aa4cd93c97a72d91ef2299861c to 55ced05a166d743634bcadef19029d36423f0ade.

8480bfdc[...]9861c to 55ced05[...]23f0ade

Posted Jun 1, 2013 21:18 UTC (Sat) by Max.Hyre (subscriber, #1054) [Link]

That's why UUIDs were such a good idea for identifying disk partitions—not.

8480bfdc[...]9861c to 55ced05[...]23f0ade

Posted Jun 6, 2013 2:12 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

Well, it is a good idea for /etc/fstab, boot paths, and /etc/automount rules. /dev/disk/by-label/ is where I go for one-off disk access. I certainly don't think the UUID should be shown in any UI other than maybe an info tab/dump on the partition/disk.

LCJ: The Linus and Dirk show

Posted May 31, 2013 23:19 UTC (Fri) by dskoll (subscriber, #1630) [Link]

> the use of e.g. a sha1 to identify the data makes it possible to combine information without needing to understand the folder structure.

Ah, yes. I would much prefer working with something called 7d9153e56f4b98d976b2b1f2eefeb3526106eedf rather than Business/Financials/2013-04/invoices.ods. Much clearer.

LCJ: The Linus and Dirk show

Posted May 31, 2013 19:06 UTC (Fri) by dskoll (subscriber, #1630) [Link]

> I believe that a lot of the "file" model is historic accident, tradition and inertia.

I disagree. Since prehistoric times, people have given things names. It's totally natural to take a blob of data and give it a name so later on you can find it by name. That's why I might take this month's sales figures and save them to a spreadsheet called "2013-05-sales".

People have proposed other things like tagging of data and dynamic searches, but that doesn't mirror human psychology very well IMO. My kids call me "dad", not "someone with the male, middle-aged, bearded, and glasses-wearing tags" because it's natural to refer to something by name, not by a list of its attributes.

Tags are fine for searching when you don't know the name of something, but in my experience I know things by name about 98% of the time and only have to search maybe 2% of the time.

LCJ: The Linus and Dirk show

Posted Jun 2, 2013 5:43 UTC (Sun) by xanni (subscriber, #361) [Link]

I never said we shouldn't have a naming system! However, the idea of applying hierarchically organised names only to chunks of data that are stored as a single "file", rather than to arbitrary and overlapping arrangements of data, is a historic limitation. The Xanadu designs permit a more powerful naming system, not less.

LCJ: The Linus and Dirk show

Posted Jun 2, 2013 20:14 UTC (Sun) by dskoll (subscriber, #1630) [Link]

> However, the idea of applying hierarchically organised names only to chunks of data that are stored as a single "file", rather than to arbitrary and overlapping arrangements of data, is a historic limitation.

I don't agree. I think it's a direct consequence of human psychology. Our brains are very good at categorization and very good at giving names to objects. Our brains are not that good at coping with names for "arbitrary and overlapping arrangements of data" which is why such schemes have had very limited uptake.

LCJ: The Linus and Dirk show

Posted Jun 3, 2013 5:34 UTC (Mon) by deepfire (guest, #26138) [Link]

> Our brains are very good at categorization

Exactly. And by imposing a single hierarchy, you are stifling this particular power.

LCJ: The Linus and Dirk show

Posted Jun 3, 2013 8:24 UTC (Mon) by dskoll (subscriber, #1630) [Link]

> Exactly. And by imposing a single hierarchy, you are stifling this particular power.

No, I disagree. Our brains are very good at putting objects into a category. We are terrible at juggling multiple categories all at once, which is why "tag clouds" and the like are not useful for remembering where something is stored. They're somewhat useful for searching or for looking for objects in a particular category, but not for serving as the canonical way to get at something.

LCJ: The Linus and Dirk show

Posted Jun 11, 2013 3:17 UTC (Tue) by fest3er (guest, #60379) [Link]

The human brain is very good at pattern matching. It is, in fact, a complex and dense pattern matcher. A filesystem is really, at its most basic level, a hierarchy of patterns. Once the human brain learns the patterns, things can be found quickly in any FS. File systems, GUIs, automobile controls, and traffic and pedestrian control systems all break down when their recognizable patterns change; they become unusable until the human brain learns the new patterns.

I challenge all of you to devise a new data storage system that can be efficiently navigated by both humans and machines.

LCJ: The Linus and Dirk show

Posted Jun 11, 2013 7:04 UTC (Tue) by oever (guest, #987) [Link]

Instead of having one hierarchy, one could have multiple hierarchies. Most file systems put files in categories, e.g. /home/me/work/projectA/{code,report}. Next to this, the same files could be made accessible by a time based hierarchy: /{mtime,ctime,atime}/{yesterday,last_month}/{morning,afternoon,evening} or /creator/{collaborators,friends,aliens,others}/. In other words, files could be indexed and made available with multi-faceted search.

LCJ: The Linus and Dirk show

Posted Jun 11, 2013 8:50 UTC (Tue) by viro (subscriber, #7872) [Link]

Oh, rapture... So a single operation can modify an unlimited subset of your "multiple hierarchies". Do tell, what kind of warranties are you going to make wrt these hierarchies being consistent with each other? What about atomicity, while we are at it?

LCJ: The Linus and Dirk show

Posted Jun 11, 2013 11:41 UTC (Tue) by oever (guest, #987) [Link]

No, not unlimited, current machines still have finite resources.

As to consistency warranties, with the current file systems, additional hierarchies cannot be updated instantly. The main hierarchy would need to be monitored. Since the additional hierarchies can depend on the contents of the files, a sane approach would be to have the entries disappear from the additional index while the file is being written to and have it possibly reappear after the updated file has been analyzed.

Such additional hierarchies could be implemented by having a daemon that creates hard links.
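A single pass of such a daemon might look like the following Python sketch. Everything here is an assumption made for illustration: the function name, the one-directory-per-day layout, and the choice of mtime as the only facet. It requires both trees to be on the same filesystem (hard links cannot cross filesystems), it ignores name clashes between directories, and it does nothing about files that change while it runs.

```python
import os
import time


def build_mtime_index(primary_root, index_root):
    """Hard-link every file under primary_root into index_root/YYYY-MM-DD/,
    grouped by modification date. Returns the list of links created."""
    created = []
    for dirpath, _dirnames, filenames in os.walk(primary_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            mtime = os.stat(src).st_mtime
            day = time.strftime("%Y-%m-%d", time.localtime(mtime))
            bucket = os.path.join(index_root, day)
            os.makedirs(bucket, exist_ok=True)
            dst = os.path.join(bucket, name)
            # Names are flattened, so files with the same name in different
            # directories clash; this sketch simply keeps the first one.
            if not os.path.exists(dst):
                os.link(src, dst)
                created.append(dst)
    return created
```

The index directory has to live outside the primary tree, otherwise the walk would index its own links.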

LCJ: The Linus and Dirk show

Posted Jun 12, 2013 6:19 UTC (Wed) by viro (subscriber, #7872) [Link]

IOW, what you propose is inherently racy and needs a seriously different IO model. What does "file is being written to" mean on Unix? Something with versioned files would have matching notions, but that's not the way Unix (or NT, for that matter, despite its VMS heritage) is designed. AFAIK, no OS of that kind has survived, so there's nothing to port the userland for such a beast from. Well, there's VMS, but it's not exactly known for lively userland development... For Unix and its ilk there simply isn't an "OK, here's the new version of file" primitive.

LCJ: The Linus and Dirk show

Posted Jun 12, 2013 8:07 UTC (Wed) by oever (guest, #987) [Link]

> What does "file is being written to" mean on Unix?

If it wasn't so easy to 'lie' with the timestamps with utimensat() or

touch -d 1999-01-01 party
then one could use mtime to indicate that a file had been written to. The combination of path + mtime + size could be an approximate method to identify contents.
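Both halves of that can be seen in a short Python sketch. The fingerprint function is a hypothetical helper, not an existing API; the second half uses os.utime, which does the same job as utimensat() or touch -d, to put the old timestamps back after changing the contents.

```python
import os
import tempfile


def fingerprint(path):
    """Approximate content identity: (absolute path, mtime in ns, size)."""
    st = os.stat(path)
    return (os.path.abspath(path), st.st_mtime_ns, st.st_size)


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "party")
    with open(path, "wb") as f:
        f.write(b"version 1")
    before = fingerprint(path)
    original = os.stat(path)

    # Change the contents without changing the length...
    with open(path, "wb") as f:
        f.write(b"version 2")
    # ...then restore the old timestamps, which any owner of the file may do.
    os.utime(path, ns=(original.st_atime_ns, original.st_mtime_ns))

    after = fingerprint(path)
    fooled = before == after  # same fingerprint, different contents
```

So the triple is cheap to compute but only trustworthy when nobody resets the timestamps, whether by accident or on purpose.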

LCJ: The Linus and Dirk show

Posted Jun 12, 2013 10:05 UTC (Wed) by viro (subscriber, #7872) [Link]

Sigh... "Had been written" != "is being written". You proposed to have file disappear from those additional hierarchies of yours when it's being written to and putting it back once that's over. To have it done in a race-free way, you'd need to
* have some way to tell the moment when the file is about to be written to
* have it removed from hell knows how many trees between that moment and the first modification
* somehow detect the moment when modifications are over
* reinsert the file into all those trees when _that_ happens.

It's very different from detecting post-factum that file has been changed. And just about impossible with Unix-style API. If opening a file created a copy (possibly with lazy copy-on-write) and if there had been an explicit commit operation making that modified copy visible in the filesystem (either as new version of versioned file or replacing the original), that might be feasible, but it's not the API expected by the Unix (or Windows, or MacOS, etc.) userland. There used to be operating systems of that sort, but they are long-dead. Sure, you can run something like TWENEX on an emulator; hell, VMS can even run on not-quite-dead hardware. But if you were to design a new OS along those lines, you'd better count on rewriting most of the userland from scratch - POSIX one expects a different IO API and VMS (let alone TWENEX) one is tied to shitloads of idiosyncrasies of those systems *and* is none too attractive, at that.
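For comparison, individual programs can approximate "modify a private copy, then commit" for a single file by writing a temporary file and renaming it over the original. The Python sketch below shows that idiom; the function name is hypothetical. It is a per-application convention rather than the kernel-level model described above: nothing forces other writers to use it, existing openers keep seeing the old file, and it gives no hook for pulling a file out of secondary hierarchies before the first modification.

```python
import os
import tempfile


def commit_new_version(path, data):
    """Write data to a private temporary file, then make it visible under
    path in one step by renaming it over whatever is there."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem as the target,
    # so it is created in the same directory.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())   # push the new contents to stable storage
        os.replace(tmp_path, path)  # the "commit": rename over the original
    except BaseException:
        # Clean up the private copy if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Readers that open the path see either the complete old contents or the complete new contents, never a half-written file.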

Or you can try to slap together a bunch of daemons, relying on vomit-inducing kludges like inotify and racy as hell (as anything idiotify-based is). Oh, and breaking on network filesystems. Usable only for file managers, and that only as long as no gnats fart in the vicinity...

LCJ: The Linus and Dirk show

Posted Jun 12, 2013 11:03 UTC (Wed) by oever (guest, #987) [Link]

fanotify might be used for cases where race conditions are not an issue. If they are an issue, then all files should go in a FUSE-fronted database, since databases can give stronger guarantees. This would of course be slower than using the normal file system directly.

The unix style API does not give strong guarantees, but it does not forbid them either. A file system that does offer strong guarantees that parallel hierarchies based on file contents are correct would not stop normal userland applications from working.

LCJ: The Linus and Dirk show

Posted Jun 12, 2013 12:02 UTC (Wed) by viro (subscriber, #7872) [Link]

fanotify does not (and cannot) work for network filesystems. It has other, er, deficiencies and as the rest of *notify family it's an architecture mistake that shouldn't be used at all.

As for the fuse-based "solutions"... Details, please. What would you do to existing openers of that file accessing it via secondary hierarchy? What do you do if somebody tries to open the same file while it's being modified, via the primary hierarchy? What do you do when somebody tries to open it for write via secondary?

I'm sorry, but saying "hocus-pocus, database, fuse, presto" is not enough to avoid the fundamental interface mismatch biting you in the arse.

I'm not saying that it's impossible to design a tolerable OS where such things would work; I'm not even saying that it wouldn't be interesting, but it's not an easy exercise and you won't get to reuse the existing userland.

LCJ: The Linus and Dirk show

Posted Jun 12, 2013 14:17 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]

On my previous project we've used a hacked-up inotify wrapper that used ZeroMQ to send change notifications for shared network files. That's OK because file change notifications are racy, anyway.

> As for the fuse-based "solutions"... Details, please. What would you do to existing openers of that file accessing it via secondary hierarchy? What do you do if somebody tries to open the same file while it's being modified, via the primary hierarchy? What do you do when somebody tries to open it for write via secondary?

Use something like MVCC for metadata. I.e. you can create your own view which will remain consistent until you're finished working with it. Of course, you'll be getting errors if you try to do conflicting changes without taking care to get explicit locks.

It can even work with most of the existing userland. It's not like Unix userland file system manipulation is reliable and race-free, anyway.

As for existing implementations, Microsoft almost got it done during the early Vista era. But they made a couple of mistakes (like writing it in early versions of C#) so it got axed along with most of C#-based stuff in Vista.

LCJ: The Linus and Dirk show

Posted Jun 18, 2013 12:39 UTC (Tue) by nix (subscriber, #2304) [Link]

> Or you can try to slap together a bunch of daemons, relying on vomit-inducing kludges like inotify and racy as hell (as anything idiotify-based is). Oh, and breaking on network filesystems. Usable only for file managers

Someone please tell the desktop people this. They seem to be relying on inotify for more and more, because 'nobody uses networks' or something.

LCJ: The Linus and Dirk show

Posted Jun 11, 2013 13:10 UTC (Tue) by raven667 (subscriber, #5198) [Link]

Isn't this already how most modern File::Open dialogs work, both showing the filesystem hierarchy and also most recently used or modified files, even across directories, by using the file change notification kernel feature and persistent systemwide indexing?

LCJ: The Linus and Dirk show

Posted Jun 7, 2013 4:25 UTC (Fri) by kevinm (guest, #69913) [Link]

I don't disagree - I personally find the single-hierarchy method works well - but in your example from above ("Business/Financials/2013-04/invoices.ods") isn't most of the name a list of attributes?

I personally think that the reason the single-hierarchy method works is that it is an analogue of the physical world, where an object has one definite location. We've spent our lives learning about a world where the knife is in the drawer, in the kitchen, on the ground floor, in the house at number 42, on Sycamore street...

Cultural diversity

Posted May 31, 2013 10:37 UTC (Fri) by NAR (subscriber, #1313) [Link]

> Hohndel noted that Kernel Summit pictures tend to contain only white males

I might be flamed to death, but I can't stop showing this group picture from a midwifery conference. Has anybody ever heard complaints about the lack of gender-diversity in that profession? I mean even the name of the profession is blatantly sexist...

Cultural diversity

Posted May 31, 2013 11:16 UTC (Fri) by xanni (subscriber, #361) [Link]

How on earth is that relevant?

Cultural diversity

Posted May 31, 2013 12:39 UTC (Fri) by petkan (subscriber, #54713) [Link]

It is, and it is not even a subtle relevance. Men and women are different. The differences are getting bigger the deeper you go from the surface. Genetically and culturally. Memes both genders begin to absorb right after their birth are different. Trying to equalize men to women (and vice versa) is plain stupid. To this date i fail to understand this hypocritical approach toward the topic.

Real or virtual, men write 99.99% of the kernel code, not (as we can see from the history so far) - women. To me, trying to change this artificially seems to be a waste of time. Let the nature run its course, i'd say.

Cultural diversity

Posted May 31, 2013 13:24 UTC (Fri) by mpr22 (subscriber, #60784) [Link]

If the words "men and women" in your first paragraph were replaced with "whites and blacks", it wouldn't look that out of place on Stormfront.

Just sayin'.

Cultural diversity

Posted May 31, 2013 13:37 UTC (Fri) by smitty_one_each (subscriber, #28989) [Link]

I must yawn at your "Just sayin'".
The Pavlovian need to scratch a topic and see racism/sexism/class warfare/injustice in everything under the sun has [extended rant elided out of respect for the venue.]

Cultural diversity

Posted Jun 2, 2013 0:33 UTC (Sun) by k8to (guest, #15413) [Link]

Soldier on, good denialist.

Cultural diversity

Posted Jun 2, 2013 11:01 UTC (Sun) by smitty_one_each (subscriber, #28989) [Link]

Denial of what? A photograph is proffered, an assertion is made.
Unless there is substantial proof to back it up, the assertion has only the weight of that which it asserts.
That is, OK, you can make a claim that so-and-so holds such-and-such an attitude, but if you haven't got email/video/confessions, then, so what?
I, for one, am committed to laughing at anyone claiming I'm racist just because you can do some math and find that many of my circles have a preponderance of European extraction. Big deal. I've also served in the military, which is all diversity. Also, big deal.
This is not to dismiss entirely the possibility that some group IS exclusive.
Merely, again, laughing at the Racism/Sexism Industrial Complex, which has made a relatively destructive career out of pointing fingers with dubious justification.

Cultural diversity

Posted Jun 2, 2013 14:11 UTC (Sun) by k8to (guest, #15413) [Link]

If you can't recognize that "men and women are genetically different" isn't anywhere close to a valid explanation of the representation of men and women in careers and that people who are making this claim are off in crazy sexism-land, then I can't help you.

However, if you can't make that connection then I'd suggest you hold forth on the topic less, because you're remarkably poorly informed.

Cultural diversity

Posted Jun 2, 2013 16:25 UTC (Sun) by smitty_one_each (subscriber, #28989) [Link]

>If you can't recognize that "men and women are genetically different" isn't anywhere close to a valid explanation of the representation of men and women in careers and that people who are making this claim are off in crazy sexism-land, then I can't help you.

If you insist on straw-manning a request for evidence in this manner, then I may not be the one in need of help.

>However, if you can't make that connection then I'd suggest you hold forth on the topic less, because you're remarkably poorly informed.

Strawman *and* a "'Shut up!' he explained." Good times.
k8to, this is but one data point, so I'll resist forming the opinion that you're one who argues in bad faith.

Cultural diversity

Posted Jun 3, 2013 0:55 UTC (Mon) by k8to (guest, #15413) [Link]

Your points sure would be valid if you weren't commenting in this exact thread where exactly those things were stated.

Cultural diversity

Posted Jun 3, 2013 10:21 UTC (Mon) by smitty_one_each (subscriber, #28989) [Link]

Declaring you the winner here, k8to.

Cultural diversity

Posted Jun 7, 2013 22:15 UTC (Fri) by Wol (subscriber, #4433) [Link]

I'll just chuck in the observation that the gender make-up of professions has changed dramatically over the years.

Even midwives! Go back several centuries and the job was the province of "amateur" women. As it became professionalised, it became almost totally male. Now it's back to almost totally women.

It's a historical accident most kernel developers are men. And one only has to read of the experience of women in the field to learn that many women who would like to be kernel developers are driven out by blatant sexism. (Two kinds, imho. The MCPs who assume that "women can't be any good" or who simply assume that any woman can be treated as a sex object - the repeated complaints about sexual assaults bear that out. And the other guys who insist on treating women as if they were blokes. Common courtesy says you treat people as individuals, and you respect them for who and what they are, you don't try and force them into your idea of the "ideal".) That said, there's no reason why we couldn't attract more women - IF WE RESPECT THEM AS *PEOPLE* - and they could bring a lot to the development.

Cheers,
Wol

Cultural diversity

Posted May 31, 2013 15:51 UTC (Fri) by vonbrand (guest, #4458) [Link]

No waste of time. Get women involved where there currently are none, and you suddenly have twice the manpower available. If you look around you, you'll soon see that most differences among men (and women) are much larger than the differences between the averages, so "men and women are different in <foo>" as a statement about averages might make sense; as a statement about random examples of each, it doesn't most of the time.

Cultural diversity

Posted May 31, 2013 16:41 UTC (Fri) by petkan (subscriber, #54713) [Link]

What do you mean by "get women involved"? Why should i/we make special effort to involve them? AFAIK the Linux kernel is not forbidden area to them anyway?

I personally don't care who made the patch/contribution so long as it makes sense. Recently Sarah Sharp caught a bug in a driver i wrote long time ago. It was a bug alright and i fixed it as soon as i could. With the appropriate "thank you" note in the commit message. See, i didn't mention she's female.

"Twice the manpower" is a good thing when you need to dig a trench. I kind of doubt it is so when it comes to software development, or any other highly specialized field. If somebody wants to see more females at the Linux conferences i'd suggest different approach, not making them kernel developers first. ;-)

Cultural diversity

Posted May 31, 2013 20:00 UTC (Fri) by hummassa (guest, #307) [Link]

> AFAIK the Linux kernel is not forbidden area to them anyway?

They are deeply discouraged by patriarchalist thinking and discourse like yours.

They are deeply discouraged when, at age three, boys are given miniature remote control cars and building sets as gifts and girls are given barbie dolls.

They are deeply discouraged when they see that the popular girls in high school tend to "feminine" classes and activities.

I could go on and on. You are refusing to see that yes, we should especially encourage girls with maths and hard sciences and computers and programming and kernel programming because they are not being encouraged enough. And you are hiding behind the "girls are not as good at math as boys" fallacious argument.

Cultural diversity

Posted May 31, 2013 20:07 UTC (Fri) by raven667 (subscriber, #5198) [Link]

> And you are hiding behind the "girls are not as good at math as boys" fallacious argument.

Or the corollary Bugblatter Beast of Traal argument; if it doesn't obviously happen to me then I can't see it happens to someone else.

Cultural diversity

Posted May 31, 2013 20:45 UTC (Fri) by RobSeace (subscriber, #4435) [Link]

> They are deeply discouraged when, at age three, boys are given miniature
> remote control cars and building sets as gifts and girls are given barbie
> dolls.
>
> They are deeply discouraged when they see that the popular girls in high
> school tend to "feminine" classes and activities.

So, the solution involves a complete overhaul of all of society? I genuinely wish you luck with that!

But, I think the point was that the problem is not something that can really be solved by a bunch of male kernel developers... As cool as they may be, I don't think they have the power to dictate to parents all over the world to stop treating their daughters and sons differently, or order the popular girls in high school how to behave... The best they can realistically do is be accepting of any women who wish to contribute... And, of course they should do so! But, that's not really going to do anything to change how women have been socialized into thinking about what sorts of things they wish to do... They're not going to all suddenly say, "Oh, the kernel developers are welcoming women, now? Well, I'm going to take up C programming immediately!"... They still have to deal with all of the rest of society pressuring them to behave like "proper ladies"...

> And you are hiding behind the "girls are not as good at math as boys"
> fallacious argument.

I didn't get that impression at all from what he said... It was more like "girls choose not to do math as much as boys for whatever reason"...

Cultural diversity

Posted Jun 1, 2013 0:44 UTC (Sat) by hummassa (guest, #307) [Link]

> So, the solution involves a complete overhaul of all of society?

Any solution to our social problems does... ;-)

> I genuinely wish you luck with that!

I seriously and genuinely appreciate it.

> The best they can realistically do is be accepting of any women who wish to contribute... And, of course they should do so! But, that's not really going to do anything to change how women have been socialized into thinking about what sorts of things they wish to do...

Ok, that last part is where my opinion diverges. If they are accepting, or better yet, encouraging, of women in the CSs, in some decades there will be enough female role models so youngsters can emulate. At the present time, a 13-yo boy has Gates, Jobs, Shuttleworth, Torvalds, Joy, Cerf, etc. to look up to, if he decides he likes computing and computers as a means of living. But who would a girl emulate?

THAT is the way one generation has "suffragettes", the next has women voters, the following one has women candidates, and eventually you get a woman head of government (like we have down here right now, and Germany, and Chile, and the UK had in the 80s, etc. etc.)

Cultural diversity

Posted Jun 1, 2013 13:03 UTC (Sat) by RobSeace (subscriber, #4435) [Link]

> If they are accepting, or better yet, encouraging, of women in the CSs, in
> some decades there will be enough female role models so youngsters can
> emulate.

Oh, I certainly agree we should definitely encourage women to take up CS... I'm just not entirely sure how to go about doing so, given all the already noted societal pressures on them to NOT do so... It's not like CS programs around the country now don't accept women or something... It's just most of them aren't interested in taking it up for whatever reason... Still, there are a handful that do, and enjoy it... I'd like to see more, but I don't know how to make that happen...

> At the present time, a 13-yo boy has Gates, Jobs, Shuttleworth, Torvalds,
> Joy, Cerf, etc. to look up to, if he decides he likes computing and
> computers as a means of living. But who would a girl emulate?

First, I wouldn't throw Jobs in that list; in a list of successful salesmen, maybe... Woz belongs on the techies list, not Jobs...

And, I'll certainly agree there aren't nearly as many highly visible female programmer role models... If I had to come up with any off the top of my head, all I could really name would be Ada Lovelace, Grace Hopper, Roberta Williams, and Valerie Aurora... (And, to be truthful, I only know the latter from reading LWN a lot...) But, yeah even the most famous of those (the long dead two) don't get the kind of mainstream media coverage that folks like Gates and Shuttleworth do... I would say it's society's worship of the filthy rich, but even Linus gets a fair amount of media coverage, and I think his wealth still pales in comparison to those sort...

Cultural diversity

Posted Jun 1, 2013 18:22 UTC (Sat) by Tobu (subscriber, #24111) [Link]

Personally I'm a fan of Fran Allen. Here's a sample of last year's Ada Lovelace day.

Cultural diversity

Posted Jun 1, 2013 18:30 UTC (Sat) by Tobu (subscriber, #24111) [Link]

(Google's date handling is broken, so you may have to try this link depending on your locale)

Cultural diversity

Posted Jun 2, 2013 11:43 UTC (Sun) by hummassa (guest, #307) [Link]

> I'm just not entirely sure how to go about doing so, given all the already noted societal pressures on them to NOT do so...

Affirmative action has been proven as a good way of doing that!

>> At the present time, a 13-yo boy has Gates, Jobs, Shuttleworth, Torvalds, Joy, Cerf, etc. to look up to, if he decides he likes computing and computers as a means of living. But who would a girl emulate?
> First, I wouldn't throw Jobs in that list; in a list of successful salesmen, maybe... Woz belongs on the techies list, not Jobs...

It wasn't meant as a list of techies, but as a list of influential and successful people in the world of computing. I said "if he decides he likes computing and computers as a means of living". Computer salespeople are computer people too; exceptional computer salespeople have the power to change the direction the industry is taking (for better and for worse).

Cultural diversity

Posted Jun 2, 2013 14:04 UTC (Sun) by raven667 (subscriber, #5198) [Link]

>> I'm just not entirely sure how to go about doing so, given all the already noted societal pressures on them to NOT do so...

>Affirmative action has been proven as a good way of doing that!

What is more interesting are the next steps after affirmative action. Affirmative action gives a shot of adrenaline to the patient but there are other tools, such as transitioning to mere advocacy without quotas. At what point is the healing self-sustaining?

Cultural diversity

Posted Jun 2, 2013 17:06 UTC (Sun) by dlang (guest, #313) [Link]

> Affirmative action gives a shot of adrenaline to the patient

and just like adrenaline, too much is harmful

Cultural diversity

Posted Jun 2, 2013 19:01 UTC (Sun) by hummassa (guest, #307) [Link]

> At what point is the healing self-sustaining?

Let's cross that bridge once we get to it. We are really far from it at the present moment... :-D Now, the moment is for affirmative action. Once you have 50.5% women kernel developers, mimicking the general population, you can get to the "keep it self-sustaining" part -- which is really pretty easy because the status quo is usually self-sustaining anyway.

Cultural diversity

Posted Jun 2, 2013 20:29 UTC (Sun) by viro (subscriber, #7872) [Link]

Excuse me, but you are misusing statistics. Badly. Kernel developers are not drawn directly from general population. Anyone (male, female, whatever) coming to kernel development with zero math and programming background will come to serious grief for themselves _and_ for everyone around. And while I can imagine somebody inspired to learn math and programming by waking up one day and deciding that Linux kernel development is neat, I very much doubt that it's a common case. So we are looking at the population of people already attracted to math. And yes, it's a damn shame that in that population there are relatively few females. I don't believe that reasons are biological, BTW; keep in mind that this population is a minuscule part of the entire species - the vast majority is mathematically and logically illiterate and proud of that. Until that changes, we are going to be dealing with the fringe slivers of many (sub)cultures that didn't hit the norm (== illiteracy). The size of that fringe depends on the subculture in question and sadly it's smaller for the subcultures girls are usually pushed into. I'm not happy about that, but what the hell can I do, other than try to make sure that my daughter (bright ten years old) is _not_ brought up illiterate? And ready and willing to piss on the idiots' opinions about what she should be, of course - respect is earned, not deserved a priori.

Math and science education uniformly *fail* for the majority of children, regardless of the gender. Your 50.5% would mean that you'd managed to keep attractiveness of kernel development for already mathematically literate folks differing between males and females to the degree inverse to ratio of the abominably small success rates of general education. This is nowhere near a self-sustained situation...

Cultural diversity

Posted Jun 2, 2013 22:38 UTC (Sun) by hummassa (guest, #307) [Link]

> Excuse me, but you are misusing statistics. Badly.

I don't think I was *using* statistics, so I can't be misusing them IIRC.

> Kernel developers are not drawn directly from general population. Anyone (male, female, whatever) coming to kernel development with zero math and programming background will come to serious grief for themselves _and_ for everyone around. And while I can imagine somebody inspired to learn math and programming by waking up one day and deciding that Linux kernel development is neat, I very much doubt that it's a common case. So we are looking at the population of people already attracted to math. And yes, it's a damn shame that in that population there are relatively few females.

I think I can stop my quoting right there.

THAT is the plan:

Act affirmatively to attract females to maths, programming, kernel programming. Time passes. Continue affirmative action. More females succeed in maths, programming, and kernel programming. Time passes. Continue affirmative action. More females succeed *visibly* in maths, programming, kernel programming. This helps attract females to maths, programming, kernel programming. Continue affirmative action. Rinse, repeat until we have 50.5% (*) of people interested in math being of female sex. Continue to repeat until 50.5% of people interested in programming are female. Continue to repeat until 50.5% of people invested in kernel programming are female. No affirmative action needed afterwards. We will have, by that time, our female versions of Torvalds, Gates, Jobs, &c.

The opposite plan:

Do nothing. Keep things like they are today. Some outliers (some as brilliant as Valerie Aurora) will enter kernel computing. Some can even become recognized as brilliant computing-world businesswomen. But they will stay the minority they are today. It is possible that none will ever have the projection that, for instance, Steve Jobs has. And that none will ever inspire lots of people to try and do things, like those I cited routinely do.

> I'm not happy about that, but what the hell can I do, other than try to make sure that my daughter (bright ten years old) is _not_ brought up illiterate?

Support affirmative action, so that the daughter of an illiterate guy (not you, you are obviously brilliant, and I mean it) can have the same chances your daughter has to develop her potential. You can do both, BTW.

> Math and science education uniformly *fail* for the majority of children, regardless of the gender. Your 50.5% would mean that you'd managed to keep attractiveness of kernel development for already mathematically literate folks differing between males and females to the degree inverse to ratio of the abominably small success rates of general education. This is nowhere near a self-sustained situation...

If we get to the goal, maybe that would not be the case, don't you think? Or maybe at least we get people from other substrates of the society, with the diversity benefits implied by *that*...

(*) whatever the female percentage of the population is.

Cultural diversity

Posted Jun 7, 2013 22:22 UTC (Fri) by Wol (subscriber, #4433) [Link]

Bear in mind that - in the UK at least - we still have some segregated schools.

And pretty much ALL the schools at the top of the league in maths are single sex, and more to the point they are GIRLS schools. I think a recent list contained only ONE boys school in the top ten.

So if girls are better than boys at maths (when encouraged), as this seems to show, why aren't there more of them in the mathematical professions?

Cheers,
Wol

Cultural diversity

Posted Jun 10, 2013 16:06 UTC (Mon) by nye (guest, #51576) [Link]

>So if girls are better than boys at maths (when encouraged), as this seems to show, why aren't there more of them in the mathematical professions?

Girls are better at everything, across the board, so this statistic doesn't really tell you what you think it's telling you.

Cultural diversity

Posted Jun 3, 2013 4:25 UTC (Mon) by raven667 (subscriber, #5198) [Link]

Heck, I'd declare success if 20% of STEM graduates were female and if the percentage of female kernel developers matched the percentage of qualified female developers in industry. I heard an interesting anecdote from a training company at a conference last year: the split in their classes was something like 60/40 male/female, but the female attendees at the conference could all have fit in the same hotel shuttle van. What about open source development, open mailing lists and technology conferences is keeping the qualified women who do exist away and not present?

Cultural diversity

Posted Jun 3, 2013 13:28 UTC (Mon) by k8to (guest, #15413) [Link]

It's worth noting

* Many women who are interested in CS give it up partway along. I have my theories as to why and personal stories, but that's not exactly rigorous.
* The participation of women in CS is much higher than that in open source.

Cultural diversity

Posted Jun 5, 2013 12:02 UTC (Wed) by Felix.Braun (guest, #3032) [Link]

Colour me naive, but why wouldn't young girls aspire to be like Wozniak, Jobs, Shuttleworth, Torvalds etc? Are you saying that just because they are girls they need different role models? Isn't that sexist?

Cultural diversity

Posted Jun 5, 2013 12:08 UTC (Wed) by hummassa (guest, #307) [Link]

Yes, yes it is. You are absolutely right.

I don't have a good answer to your question. I *think* if youngsters have role models of the same ethnicity/gender/etc., they can more easily relate to those and detect which possibilities are within their reach. And I would speculate that it occurs possibly by means of some psychoevolutionary construct. But IANAPsychologist, TINPA.

Cultural diversity

Posted Jun 5, 2013 12:48 UTC (Wed) by mpr22 (subscriber, #60784) [Link]

My naive assumption, which almost certainly reflects a bunch of exciting cognitive biases of which I am inadequately conscious, would be that the more of one's obvious, large-group labels someone shares with you, the easier it is to take them as your role model.

Perhaps more importantly, the existence of people in the same large categories as oneself who do the thing one wants to do provides useful counterexamples when confronted with some reactionary idiot saying "but you can't do (X), you're a (Y)" (or its more insidious cousin, "are you sure you want to do (X)? That's not a very (Y-like) thing, wouldn't you rather do (Z)?").

Cultural diversity

Posted Jun 1, 2013 1:07 UTC (Sat) by ssmith32 (subscriber, #72404) [Link]

I think we can all agree with you that women are genetically predisposed to be able to give birth to children, while men are not.

It is a significant leap to then conclude that they are therefore genetically predisposed to be bad at kernel development.

That is the fork in the road where our minds begin to part ways..

And if it is then only a matter of culture, then we should fix the culture, and have twice as many people working on Linux, and the kernel will be better. And, as a matter of fact, the entire GNU/Apache/Linux ecosystem would then be better.

Therefore, changing the culture around kernel development (which everyone here seems to agree is predisposed towards men) to be as predisposed to women as it is now towards men, would be one of the sweetest technical hacks ever to grace the land of linux kerneldom, possibly short of Linus's original attempt, possibly not.

Take care,
-stu

Cultural diversity

Posted Jun 1, 2013 20:39 UTC (Sat) by SEJeff (guest, #51588) [Link]

Valerie Aurora (formerly Val Henson) is/was a pretty damned good kernel dev and might disagree with your numbers :)

Cultural diversity

Posted Jun 3, 2013 13:06 UTC (Mon) by ibukanov (subscriber, #3942) [Link]

Just think - who a male professor of math resembles more, a female professor of math or a male farmer?

Cultural diversity

Posted May 31, 2013 12:32 UTC (Fri) by mpr22 (subscriber, #60784) [Link]

> I might be flamed to death, but I can't stop showing this group picture from a midwifery conference.

Perhaps you should seek assistance in dealing with your urge to be gratuitously provocative.

> Has anybody ever heard complaints about the lack of gender-diversity in that profession?

I haven't heard such complaints, but I dare say somebody has heard such complaints. More interesting is the question of whether anybody has heard a person with functioning social skills make such a complaint in good faith.

Cultural diversity

Posted May 31, 2013 15:24 UTC (Fri) by ewan (subscriber, #5533) [Link]

It is possible to make these sorts of complaints in good faith, but that wasn't what was being done here. It is not OK that some professions, including for example midwifery, nursing, and primary school teaching, have such a heavy gender bias, nor is the degree and nature of suspicion that faces any man who wants to go into a childcare profession either reasonable or acceptable.

NAR seems to think we shouldn't worry about either problem, whereas actually, both are worth worrying about. However, while the diversity problem in Free software is on topic here, the diversity problem in midwifery is not.

Cultural diversity

Posted Jun 6, 2013 11:58 UTC (Thu) by AndreE (guest, #60148) [Link]

Well, it probably IS worth thinking about why mid-wifery is so female dominated. If it is due purely to the attitudes of other mid-wives towards the capabilities of men in similar roles, then that would certainly be sexist behaviour. If it is because of uninformed and somewhat arbitrary societal expectations about the role of men and the type of jobs men should be limited to do, that would certainly highlight an entrenched sexist attitude within society.

There might also be well-motivated reasons, like the "sexual distance" between mother and mid-wife given the otherwise intimate nature of the practice, or the need for personal childbearing experience that only a mother can provide. That being said, there are plenty of male OBGYNs, male nurses, and a growing number of male midwives. The issue of discrimination has had some public visibility, and you can visit any number of nursing, parenting, or pregnancy forums to hear (read) the sort of discussions about the profession that we have here.

In any case, even if the reasons are purely discriminatory and based on flawed assumptions about roles of men in society and their capabilities to perform certain jobs, sexism in mid-wifery doesn't excuse sexism in other domains.

As a final point, the term "midwife" defines someone who assists a woman during pregnancy, originating from the Old English terms "mid" meaning "with" and "wif" meaning "woman". So the term is hardly sexist in and of itself, and men can quite logically be classified as "midwives".

Cultural diversity

Posted Jun 6, 2013 13:28 UTC (Thu) by farnz (subscriber, #17727) [Link]

It's odd that you bring that up; on the one occasion I've had significant time to chat with members of the midwifery profession, the topics covered included them bemoaning the fact that they struggled to get men into the profession, and discussing what both they as individuals and the profession as a whole is doing to encourage men to become midwives.

As Dirk and Linus were talking about kernel development (a subject that Linus and Dirk both have experience in), I'm not surprised that they didn't take the time to learn about all the other professions that have gender imbalances, and to discover whether people deeply involved in those professions have the same sorts of discussions about their gender imbalances as software developers do.

Cultural diversity

Posted Jun 8, 2013 14:27 UTC (Sat) by jch (guest, #51929) [Link]

> I might be flamed to death, but I can't stop showing this group picture from a midwifery conference. Has anybody ever heard complaints about the lack of gender-diversity in that profession?

A friend of mine who's in childcare (directing a nursery school) regularly complains about the lack of men in her profession. She's married to a computer engineer, and claims that the social issues they have due to too many women are similar to the problems we're having with too many men.

She also told me about the one man she knows in the profession, who complains about everyday discrimination at work — his female colleagues have a tendency to take over the more low-level jobs from him (say, washing a child who has pooped itself).

--jch


Copyright © 2013, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds