Show HN: A Small Programming Language Where Everything is a Value

Hacker News

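About 1 month ago

AI-generated summary

This Hacker News "Show HN" post introduces Herd, a simple interpreted programming language in which all data types (including lists and dicts) use pass-by-value semantics. The design aims to prevent side effects and to simplify memory management by removing the need for cycle detection in the reference-counting garbage collector.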

GitHub - Jcparkyn/herd: A simple interpreted programming language where everything is pass-by-value

Herd

Herd is a simple interpreted programming language where everything is a value.

Disclaimer: This is a hobby language, and you probably shouldn't use it for anything important.

What makes Herd special?

In Herd, everything is pass-by-value, including lists and dicts.
This means that when you pass a list or dict to a function, you are guaranteed that the function won't modify your copy.

You can modify variables locally just like you would in an imperative language, but there will never be any side-effects on other copies of the value.

How does it work?

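All reference types in Herd (e.g. strings, lists, dicts) use reference counting. Whenever you make a copy of a reference type, the reference count is incremented. Whenever you modify a reference type, one of two things happens:

- If there is only one reference to the value, it is modified in place.
- If there is more than one reference, a copy is made first and the modification is applied to that copy, leaving the other references untouched.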

There's one very convenient consequence of everything being a value: Reference cycles are impossible! This means the reference counting system doesn't need cycle detection, and can also be used as a garbage collector.
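The copy-on-write behaviour described in this section can be sketched in Python. This is a minimal illustration, not Herd's actual implementation: the names `_Buffer`, `CowList`, `copy`, `set` and `to_list` are hypothetical, the reference count is tracked by hand, and it assumes the usual rule that a uniquely owned value is mutated in place while a shared one is copied first.

```python
class _Buffer:
    """Shared storage with an explicit reference count (hypothetical helper)."""

    def __init__(self, items):
        self.items = list(items)
        self.refcount = 1


class CowList:
    """A list handle with value semantics, implemented via copy-on-write."""

    def __init__(self, items=()):
        self._buf = _Buffer(items)

    def copy(self):
        """'Copy' in O(1): share the buffer and increment its count."""
        other = CowList.__new__(CowList)
        other._buf = self._buf
        self._buf.refcount += 1
        return other

    def set(self, index, value):
        """Modify this handle only, never any other copy."""
        if self._buf.refcount > 1:
            # Shared: detach by making a shallow copy before writing.
            self._buf.refcount -= 1
            self._buf = _Buffer(self._buf.items)
        # Uniquely owned at this point, so writing in place is safe.
        self._buf.items[index] = value

    def to_list(self):
        return list(self._buf.items)


a = CowList([1, 2, 3])
b = a.copy()   # no data copied yet; both handles share one buffer
b.set(0, 99)   # b detaches into its own buffer, a is untouched
a.set(1, 42)   # a is uniquely owned again, so this writes in place
print(a.to_list(), b.to_list())  # [1, 42, 3] [99, 2, 3]
```

The sketch leaves out decrementing the count when a handle goes out of scope, which a real runtime would also have to do.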

Comparison to other languages

Language tour

Hello world

Use the println function to print to the console. Strings are defined using single quotes.

Variables

Types

Values in Herd have the following types:

Lists and dicts

You can use shorthand when creating dicts if the key name matches the variable name:

Blocks

In Herd, blocks are defined using parentheses. If the last expression in the block is not terminated with a semicolon, its value becomes the value of the block.

Functions

All functions in Herd are defined using the anonymous function syntax:

The body of the function is just a block expression, so you can omit the return statement if you want:

You can also omit the parentheses for simple expressions:

And there's an even shorter syntax for single-parameter functions:

Functions with no parameters can be defined using \, and called by passing a unit value ():

Functions with a variable number of arguments can be defined using .. for the last parameter:

Values and mutability

In Herd, everything is immutable unless declared with var. This includes lists and complex objects that would be mutable in other languages.

Each copy of a value is a distinct copy, and modifications to one variable won't modify other variables - even for lists and dicts!

This also applies when passing values to functions:
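For readers coming from a language with reference semantics, the observable effect can be emulated in Python, as a rough analogy rather than Herd code. The decorator name `by_value` and the function `append_zero` are hypothetical. The eager deep copy stands in for what the text describes as reference counting, which would avoid copying until a modification actually happens.

```python
import copy
import functools


def by_value(func):
    """Call func with deep copies of its arguments (emulates pass-by-value)."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*copy.deepcopy(args), **copy.deepcopy(kwargs))

    return wrapper


@by_value
def append_zero(items):
    items.append(0)  # mutates only the callee's private copy
    return items


original = [1, 2, 3]
result = append_zero(original)
print(original, result)  # [1, 2, 3] [1, 2, 3, 0]
```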

Pattern matching

Use ! to destructure lists and dicts:

You can also use pattern matching in switch expressions to handle multiple cases:

You can use var to destructure to a mutable variable, or set to modify an existing variable:

The pipe operator

The pipe operator | can be used to chain function calls in a more readable way:

You can combine the pipe operator with set to modify variables in-place using |=:

Modules and imports

Export code from a file by returning it at the end of the file:

Import code from other files using the import function:

You can also import all code from a file:

Standard library

Standard library modules are already imported for you, and their functions can be accessed through the module:

Some very commonly used functions are also available globally:

Multithreading

Herd has built-in support for multithreading using the Parallel standard library module.

Use Parallel.parallelMap to map a function over a list in parallel:

Use Parallel.parallelRun to run multiple functions in parallel and wait for all of them to complete:

In Herd, it is impossible to create a data race, because any mutations only affect the current thread.
You can safely pass complex data structures between threads without worrying about synchronization.
However (as with the rest of Herd), each function will have its own copy of the data, so the only way to communicate between threads is via the return value.
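The same discipline can be sketched in Python to show why it rules out data races. This is an analogy under stated assumptions, not the Parallel module's API: `parallel_map` and `normalize` are hypothetical names, and the explicit `copy.deepcopy` stands in for Herd's value semantics. Each worker mutates only its private copy, and results come back solely through return values.

```python
import copy
from concurrent.futures import ThreadPoolExecutor


def parallel_map(items, func, max_workers=4):
    """Hypothetical stand-in for a parallel map with value semantics."""

    def call_with_private_copy(item):
        # Each worker receives its own copy, so no state is shared.
        return func(copy.deepcopy(item))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in input order.
        return list(pool.map(call_with_private_copy, items))


def normalize(row):
    total = sum(row)
    for i in range(len(row)):
        row[i] = row[i] / total  # mutates the worker's copy only
    return row


rows = [[1, 1], [1, 3], [2, 2]]
result = parallel_map(rows, normalize)
print(rows)    # [[1, 1], [1, 3], [2, 2]]
print(result)  # [[0.5, 0.5], [0.25, 0.75], [0.5, 0.5]]
```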

Design choices

Dynamic typing
I'm not generally a fan of dynamic typing, but:

User-defined types
Herd currently has a very simple type system, with no user-defined types.
If I wanted to turn this into a production-ready language, I'd probably add Julia-esque structs and multiple dispatch (following the same immutability guarantees as the rest of the language).

Semicolons
This is mostly just because I'm too lazy to write a whitespace-sensitive parser.

var, set, and =
I took a different approach with the syntax here than most other languages I've seen, in particular that the default for = is to define immutable variables, and mutating them requires a dedicated set keyword.

So I made immutable variable definition the "default", and gave keywords to the other two operations.

Function syntax
Just trying out something different which is a bit more concise than most other languages.
This is sort of a hybrid between Rust's closure syntax (but with \ instead of |) and ML-style function calls (f x y) which require a bit less punctuation.

Currying (or lack thereof)
Currying has a lot of nice properties for language implementation and reasoning, but personally I don't think it's a good fit for Herd and would make the language less approachable.

Performance

Herd uses a very naive single-pass JIT compiler to convert code to machine code at runtime, with Cranelift for generating the final optimized machine code. This results in surprisingly good performance: not in the same league as modern JS runtimes, but competitive with or faster than many interpreted languages (e.g. CPython).

Values in Herd are represented using NaN-boxing, so primitive types (number, bool, unit) can be stored without any heap allocation. I chose this over tagged pointers, because it makes it much easier to get good numerical performance (at least for 64-bit floats) without complex inlining logic.
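The idea behind NaN-boxing can be sketched in Python with 64-bit integers. The bit layout below (three tag bits under the quiet-NaN bit, a 48-bit payload) and the tag numbers are assumptions chosen for illustration, not Herd's actual encoding. Ordinary floats are stored as their raw IEEE 754 bits, while other primitives are hidden inside the otherwise unused payload of a quiet NaN.

```python
import math
import struct

QNAN_MASK = 0x7FF8_0000_0000_0000    # exponent bits all 1 plus the quiet-NaN bit
TAG_SHIFT = 48
TAG_MASK = 0x7 << TAG_SHIFT          # three tag bits just below the quiet bit
PAYLOAD_MASK = (1 << TAG_SHIFT) - 1  # low 48 bits

TAG_NAN, TAG_BOOL, TAG_UNIT = 0, 1, 2  # hypothetical tag assignment


def box_float(x):
    """Store a float as its raw bits; no allocation or tagging needed."""
    if math.isnan(x):
        # Canonicalize so a real NaN never looks like a tagged value.
        return QNAN_MASK
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def box_bool(b):
    return QNAN_MASK | (TAG_BOOL << TAG_SHIFT) | int(b)


UNIT = QNAN_MASK | (TAG_UNIT << TAG_SHIFT)


def unbox(word):
    """Decode a 64-bit word back into a Python value."""
    if (word & QNAN_MASK) != QNAN_MASK:
        # Not a quiet NaN, so the bits are an ordinary float.
        return struct.unpack("<d", struct.pack("<Q", word))[0]
    tag = (word & TAG_MASK) >> TAG_SHIFT
    if tag == TAG_NAN:
        return math.nan
    if tag == TAG_BOOL:
        return bool(word & PAYLOAD_MASK)
    if tag == TAG_UNIT:
        return ()
    raise ValueError(f"unknown tag {tag}")


print(unbox(box_float(1.5)), unbox(box_bool(True)), unbox(UNIT))  # 1.5 True ()
```

Because a float is stored unchanged, arithmetic on numbers needs only a cheap "is this a quiet NaN?" check, which matches the motivation given above for preferring this scheme over tagged pointers.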

The current biggest performance gaps in Herd are:

Here are some benchmark numbers on an i5-13600KF, comparing Herd to CPython 3.11 and JavaScript (Node.js 18.6) on a selection of scripts (see ./benchmarks for the full code):
