Jul 24, 2014

Does the world want or need a new programming language?


I recently reviewed notes going back to 2002 on the design of a programming language I've been calling Stonewolf. I have started to implement the language at least four times and have redesigned it at least a dozen times since then. Why do I continuously throw it away and start over? The main reason is that I have never convinced myself that my reasons for wanting it were sufficiently strong to be worth the effort of a project that could well take me the rest of my life.

My original reason for wanting a new programming language was the huge amount of pure crap you have to code to get anything done these days. The last commercial project I worked on was not that complex. The formal design document was only a dozen pages long, but I can describe it in a paragraph. I had a bunch of original images stored as blobs in a database along with an image policy for various mobile devices. The job was to create versions of the images tailored for each mobile device, but to do it on the fly and to cache all the new versions of the images. If originals were deleted, the system was supposed to delete the cached reformatted images.

Not too complex, right? The job was done in Java using the Java imaging API and the Java database API. What took most of the time? Writing code to read and write the image files and converting to and from the format supported by the various APIs. Read the data in one format, convert it to another format, push it through the API, convert it to another format, write it out. Lots of work that had to be done because the different APIs were not compatible. Days spent writing code that just shuffled data from one structure to another structure... what a waste of time.

Most of the projects I've done lately are like that: pick the APIs, and spend all the project time writing code to duct tape them together into something that resembles an application. What a waste of time.

Another example is some code I wrote developing a solution to the so-called “nearest neighbor” problem. It is a hard problem but I came up with a pretty good solution. I developed it in C++. As I worked I realized I was spending more time working through C++'s extraordinarily complex type declarations than I spent working on the problem. The types I wanted were there, but to use them I had to write these obscenely complex type declarations. Of course, I was developing the thing as a template so I could adapt it to store multiple types of objects. That made it even worse. Testing templates is not easy. I thought I had written a complete test suite. Not so. The first time I tried to use the template in a program I found that a whole section of the code had never even been compiled, and in fact, did not compile. Did not, because it could not. The code had never even been parsed.

Why does it have to be so hard? If I had never used a system that wasn't so absurdly complex I might just accept it as normal and move on. But, as a young programmer I learned LISP. Back in the early '70s I had pretty much every feature of C++ but without the complexity. By the early '80s when Standard Lisp came along you had everything C++ has now, but without the complexity.

(Lisp only has a couple of problems: first off, nobody likes the parentheses, and it is named “Lisp”. Very few managers have the balls needed to go before their management and say they are going to use “lisp” for a mission critical system. Three of the greatest languages in history, Lisp, Scheme, and Smalltalk, were all killed by their names. After all, what “Real Man, He Man, Programmer” wants to go to a party and say they Lisp, Scheme, or Smalltalk all day?)

My number one open source project for many years was SDL. A great project. But, what is it? A cross platform pile of duct tape that attempts to provide a consistent, integrated API using many different libraries on different platforms that all do the same thing in different ways. The reason I first got involved with SDL was because I was looking for a portability platform for Stonewolf. By the way, SDL is an excellent pile of duct tape. It saves the programmer an enormous amount of time. But, it is still duct tape being used to cover over the absurd complexity that has been created in the form of languages, libraries, and operating systems.

Every time I started to work on Stonewolf I ran into questions that I could not answer. That is a good thing really. It made me do a lot of research. Just Unicode cost me months of reading and studying. Ever thought about the problem of defining “<” on strings with three cases and two ways to represent every letter that has a modifier?
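
To make the string comparison question concrete, here is a minimal Python sketch. It is only an illustration, not part of Stonewolf. It assumes the standard library unicodedata module, and the helper name caseless_key is hypothetical. It follows the "canonical caseless matching" recipe from the Unicode Standard (decompose, fold case, decompose again) and is not a substitute for real locale-aware collation.

```python
import unicodedata


def caseless_key(s: str) -> str:
    """Hypothetical helper: key for canonical caseless comparison."""
    # Decompose, fold case, then decompose again, so that both
    # spellings of an accented letter and all cases map to one form.
    decomposed = unicodedata.normalize("NFD", s)
    return unicodedata.normalize("NFD", decomposed.casefold())


composed = "\u00e9"      # "é" as a single code point
combining = "e\u0301"    # "e" followed by a combining acute accent

# Raw comparison works on code points, so the two spellings differ.
print(composed == combining)                              # False
print(caseless_key(composed) == caseless_key(combining))  # True

# Upper, title, and lower case forms of one letter (the "dž" digraph).
forms = ["\u01c4", "\u01c5", "\u01c6"]
print(len({caseless_key(f) for f in forms}))              # 1

# Raw "<" puts every ASCII capital before every lowercase letter.
print("Zebra" < "apple")                                  # True
print(caseless_key("Zebra") < caseless_key("apple"))      # False
```

Even with such a key, the resulting order is still code point order, which is not what a dictionary in any particular human language would use. That is part of why the question is hard.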

So, finally I just started collecting reasons why I wanted a new programming language. I have quite a pile and it keeps growing. Each reason could be the subject of a very long article. Here they are, in no particular order, and certainly not finished yet.

I want a language that:
  1. Understands that Moore's law is an unstoppable force. At one doubling every two years, 20 years gives ten doublings, so Moore's law lets me buy 2^10 = 1024 times as much hardware for fewer dollars.
  2. Understands that my life span is more valuable than computer machine cycles. It has to get rid of all the dumb ass time wasters built into so many programming languages.
  3. Understands that multi core and 64 bits is the norm, not the exception. Every year the number of cores is going to increase.
  4. Understands that there is such a thing as a network and is happy to work with it.
  5. Does NOT try to make every I/O device look like a stream. Most things that are treated as streams are really random access devices. The rest are better thought of as message/event devices.
  6. Is more like a Swiss army knife and less like duct tape. The way to add new “libraries” is to integrate them into the language, not just provide a thin layer that exposes the API warts and all. It should be like SDL. If it has to provide duct tape, it should provide excellent duct tape.
  7. Cleans up the conflation of concepts that are built into most programming languages. For example: Variables have values. Values have types. Variables do not have types so why do we treat them like they do?
  8. Supports language extension through generic programming and an extensible set of operators without an abomination like C++ templates or the need for C++ style operator declarations. (I have some ideas... but this is a tough problem.)
  9. Is very easy to read. There can be no ambiguity about where a control structure ends, or begins for that matter. Likewise the syntax should make it easier for the language implementation to identify semantic errors. How many days of your life have been wasted looking for the place where you typed “=” instead of “==”? How much life span has been wasted looking for typos like the misplaced semicolon in “if(x < 0);”? C-like syntax is full of gotchas like that one.
  10. Has a rich set of control structures. Why can't I do an SQL like select on an object container and get back a vector of objects that meet my criteria? Why aren't threads built into all programming languages? At the runtime library level?
  11. Supports more than one human language. I am not talking about locales and Unicode, I am talking about supporting multiple human languages at the source code level. Maybe the hardest problem on the list.
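
As an illustration of item 10, here is a minimal Python sketch of an SQL-like select over an ordinary container of objects. It shows what can be done today with a library function, not how a language with built-in support would look. The Image class, the select function, and the sample data are all hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Image:
    name: str
    width: int
    cached: bool


def select(items, where, order_by=None):
    """Return a list of the items for which where(item) is true."""
    rows = [item for item in items if where(item)]
    if order_by is not None:
        rows.sort(key=order_by)
    return rows


images = [
    Image("banner", 1920, cached=True),
    Image("thumb", 128, cached=False),
    Image("photo", 1024, cached=False),
    Image("icon", 64, cached=False),
]

# Roughly: SELECT * FROM images WHERE width < 1100 AND NOT cached ORDER BY width
small_uncached = select(
    images,
    where=lambda img: img.width < 1100 and not img.cached,
    order_by=lambda img: img.width,
)
print([img.name for img in small_uncached])  # ['icon', 'thumb', 'photo']
```

The criteria here are plain functions, so nothing can optimize the query the way a database would. It is a linear scan followed by a sort.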

Oh, by the way, I also want a great IDE for the language and there should be versions for every kind of machine there is and every operating system and even for embedded systems with no OS. And, yes, I want an egg in my beer...

I'm sure to come back to this and add more “wants” to the list. But, for now this is enough to keep me busy for a very long time.

The fact is that large numbers of people have been programming since the 1950s. And yet, most of the languages we are using perpetuate mistakes that first appeared in the 1950s and 1960s. Nowadays it is common to find people with 40+ years of programming experience. Dennis Ritchie was only 28 or 29 when he started developing C and in his early 30s when the first version was called “complete”. (If you ever programmed in K&R C you know it was nowhere near complete.) Brian Kernighan was a few months younger than Ritchie. As brilliant as they are, they had not lived long enough to acquire the mature understanding of the practice of programming that someone with 40+ years of experience has. They most certainly did not have an understanding of 21st century computing environments. (It may seem that I am picking on C, but I use C for my examples because so many people know it and so many languages perpetuate the mistakes of C. And, it is one of my favorite languages!)

Why do we still use languages that do not reflect the environment or experience of modern programmers?

Can I solve all these problems? I doubt it, but I can try and in trying I can at least hope to get people thinking about solutions. The group is often smarter than the individual. But, is it even worth trying? Does anyone but me care about these problems?

11 comments:

  1. You might want to check out Julia. I think it does a pretty good job at 3, 7, 8 and 9. Most of the other complaints have more to do with the ecosystem (and are among the goals of Julia package maintainers) or I didn't really get what you meant.

    http://julialang.org/

  2. You may want to try Racket (racket-lang.org); especially since you're already familiar with Lisps.

  3. If you admire C and the UNIX heritage in general, I would strongly encourage you to take a look at Google's Go language (http://golang.org/). Bell Labs alums Rob Pike and Ken Thompson are principal authors of the language. I would claim it does a great job addressing your requirements 1-4, and 9.

  4. This comment has been removed by the author.

  5. I know I will sound like just another fanboy but I will say it anyway. Learn some Erlang or Elixir; these languages have real depth.

    http://www.erlang.org/
    http://learnyousomeerlang.com/content
    http://elixir-lang.org/

  6. Some comments on the list:

    (1) "Moore's law is an unstoppable force" - nope, it really isn't. Processor speed stopped doubling about 5 years ago and has only increased incrementally since then. This was due to power density getting too high, so while CPUs hit a wall, memory (which is not known for overheating) has continued getting denser and cheaper. We're getting more cores, but I have the impression we're close to a limit. Not fundamental physical limits but limits of the current technology path.
    (2) Therefore, although programmer time is expensive, it will never be the only consideration. Making dynamic languages that are incapable of high performance is the wrong approach. It is better to make a language that simultaneously enables Rapid Application Development and fast runtime performance.
    (3) Don't worry, multiple new languages are supporting multithreading better, although we still haven't found many silver bullets.
    (4) I, too, wonder why networking isn't easier in most languages. Isn't this mostly a library problem, rather than a programming language problem?
    (5) Even if that's true, OSs still see files as streams; it's hard to paper over that. Besides, hard drives are vastly faster when files are scanned serially. (even SSDs are faster when scanned serially since they read whole blocks at a time.) PL design can't ignore reality.
    (6) I suppose you want the Wolfram Language, only open source? It's just a question of manpower, I think. Who can afford to build a standard library that provides everything that everyone needs and give it away for free? We'd all like that, and it will happen someday, but without a source of funding, it will be a very slow process.
    (7) WTF, are you railing against the entire concept of static typing? I specifically prefer statically typed languages because they make me more productive, see #2. C++ static typing sucks, but other languages such as Rust and Nemerle take the pain away.
    (8) Julia handles operators in a pretty nice way. The right solution involves treating operators as simply normal functions with a different syntax (ex. see LISP sweet expressions, or Haskell). Nemerle solves the problem of language extensibility well with its macro system, but it does not solve every problem. General language extensibility is a hard problem and people like me will be working on it for years to come.
    (9) Um... I almost never mix up = and ==, but I do think that every C/C++ compiler ought to warn about "if (x = y)". In case that's really what you meant, you could write "if ((x = y))" to suppress the warning. Most compilers usually do warn about the silly statement "x == y;". The MS C# compiler warns about "if (x == y);" with "possible mistaken empty statement". I have noticed that GCC often has unfriendly output; but while C++ has many flaws, this is a solved problem so I'd blame the compiler and not the language in these cases.
    (10) Pretty much all the new languages support something along the lines of C#'s Language Integrated Query (LINQ). Unlike SQL they don't do query optimization, but in languages with a LISP-style macro system, you could add that as a library.
    (11) I have been thinking about that problem myself for some years, and I think it has a close connection to "duck taping things together". It's not uncommon that some simple objects written by different people to perform similar tasks offer the same functionality under different names, e.g. "Length" vs "GetLength". A language that offers a way to map APIs between different human languages can also allow you to map APIs between different English naming schemes. As for a language that allows you to use non-English keywords, my own LES (Loyc Expression Syntax) is kind of interesting since it has no keywords but lets you use any word (of any language) "as if" it were a keyword...

    Size limit reached...

    Replies
    1. Moore's law is not about speed. It is about the number of components you can put on a chip. It shows no sign of stopping.

  7. ...therefore I start a new comment...

    You almost certainly don't need to make a new programming language yourself. The new ones being developed are pretty good. I think what we need right now is to figure out how to allow different languages to talk to each other, or how to convert code between languages. Because the duck tape we have isn't good enough yet.

    The problem as I see it is that tool developers are going "madly off in all directions", each making little tools that solve one specific problem in one specific language or one specific context. Not many people are looking at the big picture. My project, Loyc (http://loyc.net) is all about the big picture, but it cannot succeed as long as it is a solo job of only one person. But, you know, there is no funding for the big picture. And there is no money in making the ultimate programming language.

  8. We STILL build software like the Wright brothers built airplanes: Build something, push it off a cliff, watch it crash, then bring it back up to the cliff, fix what broke, and repeat.

    At least the Wrights didn't have to cope with some idiot project manager to arbitrarily declare it "done." They repeated until they had, indeed, flown (albeit 120 feet!).

    I'd love to see what general-purpose programming language you'd design. Today we have a choice: Make all the decisions about the programmers' intent in the compiler (i.e., the program emerges from the first successful compile as a working, testable module) or in some interpreter (i.e., you can write any old slop, and if it passes the crude syntax, semantics and pragmatics are hidden in some interpreter). It seems to me there's a better way.

    I agree that Lots of Irritating Single Parentheses was a superb language...for a certain class of task.

  9. I hereby invoke Greenspun's tenth rule :)
