Monday, 29 April 2019

The “Blub Paradox” and C++

Here is another of my answers from Stack Overflow that is getting downvotes:

is there some powerful language feature or idiom that you make use of in a language that would be hard to conceptualize or implement if you were writing only in c++?
Are there any useful concepts or techniques that you have encountered in other languages that you would have found difficult to conceptualize had you been writing or "thinking" in c++?
C++ makes many approaches intractable. I would go so far as to say that most of programming is hard to conceptualize if you limit yourself to C++. Here are some examples of problems that are much more easily solved in ways that C++ makes hard.

Register allocation and calling conventions

Many people think of C++ as a bare-metal, low-level language but it really isn't. By abstracting away important details of the machine, C++ makes it hard to conceptualize practicalities like register allocation and calling conventions.
To learn about concepts like these I recommend having a go at some assembly language programming and checking out this article about ARM code generation quality.

Run-time code generation

If you only know C++ then you probably think that templates are the be-all and end-all of metaprogramming. They aren't. In fact, they are an objectively bad tool for metaprogramming. Any program that manipulates another program is a metaprogram, including interpreters, compilers, computer algebra systems and theorem provers. Run-time code generation is a useful feature for this.
I recommend firing up a Scheme implementation and playing with EVAL to learn about metacircular evaluation.
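As a rough illustration of run-time code generation, here is a minimal sketch in Python (chosen only for illustration, not Scheme). It builds the source of a specialised function as a string while the program is running, compiles it with the built-in compile and exec, and then calls the result. The names make_power and power are hypothetical.

```python
def make_power(n):
    """Generate, at run time, a function that computes x**n by repeated multiplication."""
    # Build the source of an unrolled expression, e.g. "x * x * x" for n = 3.
    body = " * ".join(["x"] * n) if n > 0 else "1"
    source = f"def power(x):\n    return {body}\n"
    namespace = {}
    # compile() turns the text into a code object; exec() runs the def statement,
    # which binds the new function inside `namespace`.
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["power"]


cube = make_power(3)
print(cube(5))  # 125
```

The point is only that the program being run did not exist when the host program was written, which is something templates cannot express because they are expanded entirely at compile time.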

Manipulating trees

Trees are everywhere in programming. In parsing you have abstract syntax trees. In compilers you have IRs that are trees. In graphics and GUI programming you have scene trees.
This "Ridiculously Simple JSON Parser for C++" weighs in at just 484 LOC, which is very small for C++. Now compare it with my own simple JSON parser, which weighs in at just 60 LOC of F#. The difference is primarily because the ML family's algebraic datatypes and pattern matching (including F#'s active patterns) make it vastly easier to manipulate trees.
Check out red-black trees in OCaml too.

Purely functional data structures

Lack of GC in C++ makes it practically impossible to adopt some useful approaches. Purely functional data structures are one such tool.
For example, check out this 47-line regular expression matcher in OCaml. The brevity is due largely to the extensive use of purely functional data structures, in particular dictionaries whose keys are sets. That is really hard to do in C++ because the stdlib dictionaries and sets are all mutable, and mutating a dictionary's keys breaks the collection.
Logic programming and undo buffers are other practical examples where purely functional data structures make something that is hard in C++ really easy in other languages.
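The "dictionary whose keys are sets" idea can be sketched with the subset construction that turns an NFA into a DFA. This is a Python illustration, not the OCaml matcher itself: the immutable frozenset plays the role of a purely functional set, the outer dict is still an ordinary mutable one, and the NFA table and all function names are hypothetical example data.

```python
# A tiny NFA over {"a", "b"} that accepts strings ending in "ab".
# Transitions map (state, symbol) to an immutable set of next states.
NFA = {
    (0, "a"): frozenset({0, 1}),
    (0, "b"): frozenset({0}),
    (1, "b"): frozenset({2}),
}
START, ACCEPTING = frozenset({0}), 2


def step(states, symbol):
    """Return the set of NFA states reachable from `states` on `symbol`."""
    return frozenset(t for s in states for t in NFA.get((s, symbol), frozenset()))


def determinise(alphabet="ab"):
    """Subset construction: the DFA is a dictionary keyed by sets of NFA states."""
    dfa, todo = {}, [START]
    while todo:
        states = todo.pop()
        if states in dfa:
            continue  # this set of states has already been expanded
        dfa[states] = {c: step(states, c) for c in alphabet}
        todo.extend(dfa[states].values())
    return dfa


def matches(dfa, text):
    """Run the DFA over `text` (which must use only symbols from the alphabet)."""
    states = START
    for c in text:
        states = dfa[states][c]
    return ACCEPTING in states


dfa = determinise()
print(len(dfa), matches(dfa, "aab"), matches(dfa, "aba"))  # 3 True False
```

Because a frozenset can never change after it is created, it is safe to use as a key: nothing can alter it behind the dictionary's back.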

Tail calls

Not only does C++ not guarantee tail calls but RAII is fundamentally at odds with them, because destructors that must run after a call mean the call is no longer in tail position. Tail calls let you make an unbounded number of function calls using only a bounded amount of stack space. This is great for implementing state machines, including extensible state machines, and it is a great "get out of jail free" card in many otherwise-awkward circumstances.
For example, check out this implementation of the 0-1 knapsack problem using continuation-passing style with memoization in F# from the finance industry. When you have tail calls, continuation-passing style can be an obvious solution but C++ makes it intractable.
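To make the state machine idea concrete, here is a sketch in Python. Python, like C++, does not guarantee tail calls, so the sketch swaps in a trampoline to emulate them: each state returns the next call as a zero-argument function instead of making the call itself, and a loop runs those calls one at a time. The names trampoline, even and odd are hypothetical.

```python
def trampoline(thunk):
    """Run a chain of emulated tail calls in constant stack space.

    Each step returns either a zero-argument callable (the next call)
    or a final, non-callable value.
    """
    result = thunk
    while callable(result):
        result = result()
    return result


# A two-state machine deciding whether a bit string contains an even number of 1s.
# Each state hands control to the next by returning a thunk rather than calling it.
def even(bits, i=0):
    if i == len(bits):
        return True
    return lambda: (odd if bits[i] == "1" else even)(bits, i + 1)


def odd(bits, i=0):
    if i == len(bits):
        return False
    return lambda: (even if bits[i] == "1" else odd)(bits, i + 1)


bits = "1" * 100_000
print(trampoline(lambda: even(bits)))  # True: 100,000 ones is an even count
```

Written as plain mutual recursion, the same machine would exceed Python's default recursion limit on this input. In a language that guarantees tail calls the thunks and the loop disappear, and each state simply calls the next.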


Another obvious example is concurrent programming. Although this is entirely possible in C++, it is extremely error-prone compared to other tools, most notably communicating sequential processes as seen in languages like Erlang, Scala and F#.

Monday, 22 April 2019

In terms of performance speed, is Swift faster than Java?

This is an old 2017 answer of mine from Quora that has been deleted by moderators for violating their rules, kept here for historical interest:
Many people here are answering "yes" and claiming that Swift is fast. However, all of the benchmark results I can find indicate that Swift is much slower than most other languages, up to 24x slower than C++:
  • n-body: Swift 24.09s vs Java 22.6s. Swift is about 1.07x slower than Java.
  • fannkuch-redux: Swift 59.5s vs Java 17.41s. Swift is 3.4x slower than Java.
  • spectral-norm: Swift 15.7s vs Java 4.28s. Swift is 3.7x slower than Java.
EDIT: The shootout appears to have been updated. The new Swift code is still memory-unsafe and still compiled in the memory-unsafe -Ounchecked mode. Swift now beats Java on 3/7 tasks (fannkuch-redux, mandelbrot and binary-trees), roughly draws on two (n-body and fasta-redux) and loses on two (fasta and spectral-norm). It is a shame there is no memory-safe Swift for comparison. I’d also note that many of these benchmarks are apples-versus-oranges comparisons: for example, the Java and Swift implementations of binary-trees use completely different algorithms and data structures.
Other (non-Java) benchmarks found:
My impression is that Swift was vastly slower than almost all other languages at many common tasks when it was first released but it has improved very rapidly. However, it still seems to be several times slower than C++ in many cases and, therefore, I expect it is still significantly slower than languages like F# and Scala.
I am also concerned about many of the benchmarks being used. The shootout is notoriously bad science, with submissions subjectively "de-optimised" by its owner for being too fast, rendering the results total garbage. Numerical benchmarks like DGEMM and FFTs are largely irrelevant. I am much more interested in symbolic performance, not just because that is more relevant to the code I write but because it will stress Swift's reference-counting garbage collection, which I suspect will be its Achilles heel. The JSON serialization benchmark was very interesting to me as a consequence. I am also disturbed by the use of optimisations that remove memory safety in Swift (-Ounchecked) and the use of inaccurate numerics in C++ (-ffast-math).
A post on Hacker News (Swift Performance: Too Slow for Production) says that short-lived objects in the JSON serializer are to blame for the poor performance. In other words, the reference-counted memory management is the problem, as I suspected.