The BRZ got a useful idea for a Blockchain project: notarizing document existence via their cryptographic hashes, codename "Blockstempel".
Lines of Code
09. March 2022
After the Log4Shell debacle in December (no, I don't want to provide a zillion links), one security aspect is coming up in discussions again: lines of code, i.e. the attack surface of services.
As a measurement, "Lines of Code" spans a wide numerical range: from small one-line libraries (cough, NodeJS is-promise, cough) to millions of lines changed in every single Linux kernel release¹.
Let's do a small comparison.
Now this is only the plain JDK - no Spring Boot nor any other libraries, dependencies, or actual application code. Some files may be misdetected - there are test files and other stuff included that's not being run in production, etc. - but we'll take that as a rough measure.
    $ apt-get source sbcl
    ...
    Need to get 6.767 kB of source archives.
    Get:1 http://deb.debian.org/debian testing/main sbcl 2:2.1.11-1 (dsc) [2.565 B]
    Get:2 http://deb.debian.org/debian testing/main sbcl 2:2.1.11-1 (tar) [6.688 kB]
    Get:3 http://deb.debian.org/debian testing/main sbcl 2:2.1.11-1 (diff) [76,7 kB]
    ...
    $ cd sbcl-2.1.11/
    $ LC_ALL=C sloccount .
    ...
    Totals grouped by language (dominant language first):
    lisp:        444220 (91.74%)
    ansic:        32577 (6.73%)
    sh:            4847 (1.00%)
    asm:           2532 (0.52%)
    cpp:             27 (0.01%)
    pascal:           5 (0.00%)

    Total Physical Source Lines of Code (SLOC) = 484,208
That's 5.7 percent of the Java LOC: a bit more than one-twentieth, or roughly one-seventeenth to be more precise.
Now, I'm not a Java person, so I can't really comment on that ecosystem; I wouldn't even know which libraries are needed or recommended. Instead of adding up wrong numbers, let's take only the POC that I know, the one referenced above (in a newer version), and get the numbers for the complete solution.
With Common Lisp being an interactive and introspectable system by design, the compiler and ASDF already record source locations of variables, constants, functions, structure/class definitions, libraries, and so on. I only need to add one special case: the overall SBCL source location, so that the whole SBCL implementation is counted in as well:
    (let ((systems))
      ;; Get all loaded systems
      (asdf:map-systems (lambda (f) (push f systems)))
      ;; Fetch source directory paths
      (let ((paths (list* (namestring (translate-logical-pathname #P"SYS:SRC;"))
                          (loop for f in systems
                                for src-dir = (asdf:system-source-directory f)
                                for path = (when src-dir (namestring src-dir))
                                when path collect path)))
            (kept ()))
        ;; Reduce to the common base paths of ASDF subsystems
        (loop for p in (sort paths #'< :key #'length)
              unless (find-if (lambda (c) (alexandria-2:starts-with-subseq c p)) kept)
                do (push p kept))
        ;; Run "sloccount" to get statistics
        (uiop:run-program (list* "env" "LC_ALL=C" "sloccount"
                                 ;"--details" ;; enable to get per-file LOCs
                                 kept)
                          :output "/tmp/loc.txt"
                          :error-output :string)))
    NIL
    "Warning: newline in string - file ...cxml-20200610-git/doc/index.xml, line 64
    Warning: newline in string - file ...cxml-20200610-git/doc/index.xml, line 68
    ...
    "
    0
We get a few warnings, but the exit code is zero.
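If only the grand total is of interest, it can be pulled out of the report with a one-liner. This is a minimal sketch, not part of the original setup: the function name `sloc_total` is made up for illustration, and it assumes the report ends with a "Total Physical Source Lines of Code (SLOC) = …" line formatted like the one in the SBCL run above (comma as thousands separator, as produced under `LC_ALL=C`).

```shell
# Hypothetical helper (not from the original post): print the grand total
# of a sloccount report as a plain integer.
sloc_total() {
    # The total line is assumed to look like:
    #   Total Physical Source Lines of Code (SLOC)    = 484,208
    # Split on "=", then strip blanks and thousands separators.
    awk -F= '/^Total Physical Source Lines of Code/ {
                 gsub(/[ ,]/, "", $2)
                 print $2
             }' "$1"
}
```

It would be called as `sloc_total /tmp/loc.txt`.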
So, let's look at the results. The output file /tmp/loc.txt contains the usual suspects (like trivial-backtrace, and so on); 58 source locations in total. The LOC count shows:
So the full POC of an HTTPS-enabled application is less than 700 thousand lines of code², or 8.2% of the Java 11 Development Kit alone (no libraries, no frameworks, no application code counted there!)…
Turning that ratio around: with a Java solution there are (or rather, would be; Log4Shell shows that nobody does that) more than 10 times as many LOC to review.
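For anyone who wants to check the arithmetic: a percentage of the form "A is P percent of B" inverts to "B is 100/P times as large as A". The snippet below is only a sketch; the function name `times_larger` is invented for illustration, and the two inputs are the percentages quoted in this post.

```shell
# Hypothetical helper: given "A is P percent of B",
# print how many times larger B is than A.
times_larger() {
    # LC_ALL=C keeps the decimal point independent of the user's locale.
    LC_ALL=C awk -v pct="$1" 'BEGIN { printf "%.1f\n", 100 / pct }'
}

times_larger 5.7   # SBCL alone vs. the JDK: prints 17.5
times_larger 8.2   # full POC vs. the JDK:   prints 12.2
```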
Ain't that a good reason (one of many) to learn a new³, more effective programming language?
¹ Linux 5.14 from Git has 20755459 LOC, according to sloccount; https://lwn.net/Articles/867540/ reports +861000 -321000 LOC.
² These 680 KLOC even include duplicated Unicode tables – e.g. SBCL has src/code/external-formats/enc-cn-tbl.lisp (44973 LOC), while flexi-streams-20210807-git provides a (semantically probably identical) enc-cn-tbl.lisp (48314 LOC) – and then there's a
³ Well, rather an old programming language, really.