flang.dev — The Fuzion Language Portal

Strings

Strings for human readable output

A very frequent use of strings is for human readable output to a console, a log file, etc. A simple and easy syntax to create such strings simplifies everyday programming and debugging significantly.

Here are possible syntaxes to print the values of two variables x and y:

out.println("x is "+x+", y is "+y+"!");

This would imply a concatenation operator "+" on strings. It is nice since it requires no specific syntax, but it needs four additional characters "++" to include a single value x.

out.println("x is ",x,", y is ",y,"!");

This would imply that println takes a variable-length parameter list. It also requires no specific syntax, but it needs four additional characters ",," to include a single value x.

out.println("x is {x}, y is {y}!");

This is somewhat awkward to parse, in particular if we do not print a variable but an arbitrary expression such as y*y. On the pro side, it requires only two additional characters {} to include a single value x, and it is fairly nice to read.

A big disadvantage here is that this approach is not open to internationalization, i.e., replacing the strings with variables that depend on, e.g., the language environment.
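The internationalization concern can be illustrated in Python, which offers both styles. An f-string literal is fixed in the source code, whereas a str.format template is an ordinary value that can be selected at run time. The following is a minimal sketch; the TEMPLATES catalog, its translations and the describe helper are made up for illustration.

```python
# Hypothetical message catalog: keys and translations are illustrative
# only and are not part of any Fuzion or flang.dev API.
TEMPLATES = {
    "en": "x is {x}, y is {y}!",
    "de": "x ist {x}, y ist {y}!",
}


def describe(lang: str, x: int, y: int) -> str:
    """Pick the template at run time, then fill in the values."""
    return TEMPLATES[lang].format(x=x, y=y)


print(describe("en", 3, 4))  # x is 3, y is 4!
print(describe("de", 3, 4))  # x ist 3, y ist 4!
```

With an embedded-expression literal such as f"x is {x}, y is {y}!" the text and the expressions are fused at compile time, so there is no template value left to swap.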

println("x is $x, y is $y!");

This Kotlin approach requires only one additional character $. However, to include more than a single field identifier or to separate it from following text, three additional characters are required, e.g., "length is ${length}cm".

out.println("x is " x ", y is " y "!");

This would require a sequence of strings and expressions to imply a conversion to strings and their concatenation. In other words, pure juxtaposition acts as an operator. It requires two additional characters "" to include a single value x (the spaces are optional). The change to the grammar is minor.

printf("x is %d, y is %d!", x, y);

This C printf-style formatting is very flexible, but it separates the formatting from the variable. It requires three extra characters for each value printed: %d in the format string plus a comma in the argument list.

std::cout << "x is " << x << ", y is " << y << "!\n";

The C++ style may have looked cool 20 years ago, but it is of little help: it adds six characters "<<<<" for every value to be printed.

say ("Partition %5d into %2d prime piece" % (num, parts), parts == 1 ? ': ' : 's: ', prime_partition(num, parts).join('+') || 'not possible')

Sidef uses the operator '%' on Strings to format arguments provided as a tuple argument. Quite nice.

Fuzion Approach to human readable strings

The main idea is to provide an abstract string class that can be converted into a list or stream of bytes. Strings therefore do not need to be physically present in memory; string concatenation means creating a list (stream) by concatenating two existing lists (streams).
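The idea of a string that exists only as a byte stream can be sketched in Python as follows. This is an illustration under assumptions, not Fuzion's implementation: the LazyString class and its of and materialize methods are hypothetical names, and Python's str() stands in for asString.

```python
from itertools import chain
from typing import Callable, Iterator


class LazyString:
    """Hypothetical abstract string: it only knows how to produce its bytes."""

    def __init__(self, make_bytes: Callable[[], Iterator[int]]):
        self._make_bytes = make_bytes

    @staticmethod
    def of(value: object) -> "LazyString":
        # Encode on demand; str(value) plays the role of asString.
        return LazyString(lambda: iter(str(value).encode("utf-8")))

    def __iter__(self) -> Iterator[int]:
        return self._make_bytes()

    def __add__(self, other: object) -> "LazyString":
        rhs = other if isinstance(other, LazyString) else LazyString.of(other)
        # Concatenation only chains the two byte streams; nothing is copied.
        return LazyString(lambda: chain(iter(self), iter(rhs)))

    def materialize(self) -> str:
        # Only here are the bytes actually collected in memory.
        return bytes(iter(self)).decode("utf-8")


x, y = 3, 4
msg = LazyString.of("x is ") + x + ", x+y is " + (x + y) + "!"
print(msg.materialize())  # x is 3, x+y is 7!
```

Each + builds a small object that refers to its two operands, so the cost of concatenation does not depend on the length of the strings involved.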

As in Java, infix + can be used as a standard way to concatenate strings with other values that are stringable, an abstract feature providing asString:

out.println("x is "+x+", x+y is "+(x+y)+"!");

Support for {x} notation within strings can be reflected in the grammar by introducing tokens lstring, rstring and mstring for strings that end with an opening brace, start with a closing brace, or both start with a closing brace and end with an opening brace, respectively.

Then, the string in

out.println("x is {x}, x+y is {x+y}!");

would be split by the lexer into three different string tokens: a t_lstring "x is ", a t_mstring ", x+y is " and a t_rstring "!", with all the tokens of the expressions x and x+y appearing as usual in between. The parser could then be extended to support expressions of the form

string -> t_string | t_lstring { expr t_mstring }* expr t_rstring

and convert them into an AST that is equal to the code using normal strings and infix + explicitly.
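The splitting and the rewrite into explicit infix + can be sketched in Python. This is a simplified illustration, not the Fuzion lexer: the functions lex_string and desugar are hypothetical, expressions are kept as raw text instead of being tokenized, and nested braces or string literals inside an expression are not handled.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (token kind, text)


def lex_string(body: str) -> List[Token]:
    """Split the text between the quotes into string and expression tokens."""
    tokens: List[Token] = []
    pos = 0
    while True:
        open_at = body.find("{", pos)
        if open_at < 0:
            # No further '{': a plain string, or the closing piece.
            tokens.append(("t_string" if not tokens else "t_rstring", body[pos:]))
            return tokens
        close_at = body.index("}", open_at)
        # The first piece is a t_lstring, every later one a t_mstring.
        tokens.append(("t_lstring" if not tokens else "t_mstring", body[pos:open_at]))
        tokens.append(("expr", body[open_at + 1:close_at]))
        pos = close_at + 1


def desugar(tokens: List[Token]) -> str:
    """Rewrite the tokens as equivalent source using explicit infix +."""
    parts = [f"({text})" if kind == "expr" else f'"{text}"' for kind, text in tokens]
    return "+".join(parts)


tokens = lex_string("x is {x}, x+y is {x+y}!")
print([kind for kind, _ in tokens])
# ['t_lstring', 'expr', 't_mstring', 'expr', 't_rstring']
print(desugar(tokens))
# "x is "+(x)+", x+y is "+(x+y)+"!"
```

A real parser would build AST nodes rather than source text, but the shape of the result is the same: alternating string constants and expressions joined by +.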

A bit more tricky is handling of $ as in

out.println("x is $x, y is $y!");

The lexer could treat $ as a string termination character like ", convert the $x into a special identifier sident and parse the remainder as if a new string was started with ". The grammar would then become

string -> ( t_string | t_lstring { expr t_mstring }* expr t_rstring ) { t_sident string }

The big advantage: This whole string magic would be handled mostly by the lexer and in part by the parser. The problem is solved once the AST has been built.
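The $ handling can be sketched in Python in the same spirit. The function lex_dollar is hypothetical, and the identifier pattern (a letter or underscore followed by letters, digits or underscores) is an assumption; Fuzion's actual identifier rules may differ. For brevity the sketch handles only $ and ignores the brace notation.

```python
import re
from typing import List, Tuple

# Assumed identifier syntax; not taken from the Fuzion grammar.
_SIDENT = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def lex_dollar(body: str) -> List[Tuple[str, str]]:
    """Treat '$' as a string terminator followed by an identifier token."""
    tokens: List[Tuple[str, str]] = []
    pos = 0
    for match in _SIDENT.finditer(body):
        tokens.append(("t_string", body[pos:match.start()]))  # text before '$'
        tokens.append(("t_sident", match.group(1)))            # the identifier
        pos = match.end()                                      # resume as a new string
    tokens.append(("t_string", body[pos:]))
    return tokens


for token in lex_dollar("x is $x, y is $y!"):
    print(token)
# ('t_string', 'x is ')
# ('t_sident', 'x')
# ('t_string', ', y is ')
# ('t_sident', 'y')
# ('t_string', '!')
```

The resulting token sequence has the shape string { t_sident string }, matching the extended grammar rule.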

Debugging output

Python permits handy debug strings using f'{x * y = }' as shorthand for f'x * y = {x * y!r}'. NYI: This would be convenient for debugging; we might consider it for Fuzion as well.
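A short Python example (requires Python 3.8 or later) shows the exact behavior of this shorthand. Two details are worth noting: the expression text is copied verbatim, including any whitespace around the = sign, and the value is rendered with repr() unless a conversion or format is given.

```python
x, y = 6, 7

shorthand = f'{x * y =}'            # expression text is copied verbatim
spelled_out = f'x * y ={x * y!r}'   # what the shorthand stands for

print(shorthand)  # x * y =42
assert shorthand == spelled_out

# Whitespace around '=' is preserved, so this variant adds the space:
spaced = f'{x * y = }'
print(spaced)  # x * y = 42
```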

Formatting

Python permits f'{num:.2f}' to use a type-specific format string, in this case .2f for two decimals, when printing a value. The same could be achieved in Fuzion using, e.g., an operator infix $(string format); we could then have a Fuzion string like "{num$".2f"}" to achieve the same effect. NYI: is this useful enough to be supported?
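For reference, the Python behavior looks as follows. The helper fmt is a hypothetical stand-in for the proposed infix $ operator, included only to show that formatting can be an ordinary operation on a value and a format string instead of special syntax inside the literal.

```python
num = 3.14159

# Format specification inside the string literal.
print(f'{num:.2f}')  # 3.14


def fmt(value: object, spec: str) -> str:
    """Hypothetical analogue of the proposed `value $ spec` operator."""
    return format(value, spec)


# The same result with formatting as a separate operation,
# combined with the rest of the text by plain concatenation.
print("num is " + fmt(num, ".2f"))  # num is 3.14
```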

Text Blocks

Text blocks were introduced in Java 15 by JEP 378: Text Blocks. They use """ to start and end a multi-line string constant.