Brainfuck
From Wikipedia, the free encyclopedia
The brainfuck language is an esoteric programming language noted for its extreme minimalism. It was designed to challenge and amuse programmers, and is not suitable for practical use. Its name has been variously bowdlerized, as in brainf*ck or brainfsck, since its name contains the expletive "fuck". The name of the language is generally not capitalized, despite the fact that it is a proper noun.
Contents |
[edit] Language design
Urban Müller created brainfuck in 1993 with the intention of designing a language which could be implemented with the smallest possible compiler [1], inspired by the 1024 byte compiler for the FALSE programming language. Several brainfuck compilers have been made smaller than 200 bytes. The classic distribution is Müller's version 2, containing a compiler for the Amiga, an interpreter, example programs, and a readme document.
The language consists of eight commands, listed below. A brainfuck program is a sequence of these commands, possibly interspersed with other characters (which are ignored). The commands are executed sequentially, except as noted below.
The brainfuck language uses a simple machine model consisting, besides the program, of an array of 30,000 byte cells initialized to zero, a movable pointer into the array (initialized to point to the leftmost byte of the array), and two streams of bytes for input and output (most often connected to a keyboard and a monitor respectively, and using the ASCII character encoding).
[edit] Commands
The eight language commands, each consisting of a single character, are the following:
Character | Meaning |
---|---|
> |
increment the pointer (to point to the next cell to the right). |
< |
decrement the pointer (to point to the next cell to the left). |
+ |
increment (increase by one) the byte at the pointer. |
- |
decrement (decrease by one) the byte at the pointer. |
. |
output the value of the byte at the pointer. |
, |
accept one byte of input, storing its value in the byte at the pointer. |
[ |
jump forward to the command after the corresponding ] if the byte at the pointer is zero. |
] |
jump back to the command after the corresponding [ if the byte at the pointer is nonzero. |
(Alternatively, the ]
command may instead be translated as an unconditional jump to the corresponding [
command, or vice versa; programs will behave the same but will run more slowly.)
Brainfuck programs can be translated into C using the following substitutions, assuming ptr
is of type unsigned char*
and has been initialized to point to an array of zeroed bytes:
brainfuck command | C equivalent |
---|---|
> |
++ptr; |
< |
--ptr; |
+ |
++*ptr; |
- |
--*ptr; |
. |
putchar(*ptr); |
, |
*ptr=getchar(); |
[ |
while (*ptr) { |
] |
} |
As the name suggests, brainfuck programs tend to be difficult to comprehend. Partly this is because any mildly complex task requires a long sequence of commands; partly it is because the program's text gives no direct indications of the program's state. These, as well as brainfuck's inefficiency and its limited input/output capabilities, are some of the reasons it is not used for serious programming. Nonetheless, like any Turing-complete language, brainfuck is theoretically capable of computing any computable function or simulating any other computational model, if given an unlimited memory store[2]. A variety of brainfuck programs have been written[3].
[edit] Brainfuck's formal "parent language"
Except for its two I/O commands, brainfuck is a minor variation of the formal programming language P prime prime (P′′) created by Corrado Böhm in 1964. (In fact, using six symbols equivalent to the respective brainfuck commands +, -, <, >, [, ], Böhm provided an explicit program for each of the basic functions that together serve to compute any computable function. So in a very real sense, the first "brainfuck" programs appear in Böhm's 1964 paper – and they were programs sufficient to prove Turing-completeness.)
[edit] Examples
[edit] Hello World!
The following program prints "Hello World!" and a newline to the screen:
++++++++++ [>+++++++>++++++++++>+++>+<<<<-] The initial loop to set up useful values in the array >++. Print 'H' >+. Print 'e' +++++++. Print 'l' . Print 'l' +++. Print 'o' >++. Print ' ' <<+++++++++++++++. Print 'W' >. Print 'o' +++. Print 'r' ------. Print 'l' --------. Print 'd' >+. Print '!' >. Print newline
For readability, this code has been spread across many lines and comments have been added. Brainfuck treats all characters but +-<>[],.
as comments so no special syntax for a comment is needed. The code could just as well have been written as:
++++++++++[>+++++++>++++++++++>+++>+<<<<-]>++.>+.+++++++..+++.>++.<<+++++++++++++++.>.+++.------.--------.>+.>.
The first line initialises a[0] = 10
by simply incrementing ten times from 0. The loop from line 2 effectively sets the initial values for the array: a[1] = 70 (close to 72, the ASCII code for the character 'H'), a[2] = 100
(close to 101 or 'e'), a[3] = 30
(close to 32, the code for space) and a[4] = 10
(newline). The loop works by multiplying the value of a[0]
, 10
, by 7, 10, 3, and 1, saving the results in other cells. After the loop is finished, a[0] is zero. >++.
then moves the pointer to a[1]
which holds 70
, adds two to it (producing 72 which is the ASCII character code of a capital H), and outputs it.
The next line moves the array pointer to a[2]
and adds one to it, producing 101
, a lower-case 'e', which is then output.
As 'l' happens to be the seventh letter after 'e', to output 'll' we add another seven (+++++++
) to a[2]
and output the result twice.
'o' is the third letter after 'l', so we increment a[2]
three more times and output the result.
The rest of the program goes on in the same way. For the space and capital letters, different array cells are selected and incremented or decremented as needed.
[edit] Trivial
[edit] Cell-clear
[-]
A simple program fragment which sets the current location to 0, by iteratively decrementing until it is equal to 0.
[edit] Simple loop
,[.,]
A continuous loop that takes text input from the keyboard and echoes it to the screen (similar to the Unix cat program). Note that this assumes the cell is set to 0 when a ',
' command is executed after the end of input (sometimes called end-of-file or "EOF"); implementations vary on this point. For implementations that set the cell to -1 on EOF, or leave the cell's value unchanged, this program would be written ",+[-.,+]
" or ",[.[-],]
" respectively.
[edit] Moving the pointer
>,[.>,]
A version of the last one that also saves all the input in the array for future use, by moving the pointer each time.
[edit] Add
[->+<]
This adds the current location (destructively, it is left at zero) to the next location.
[edit] Conditional loop statements
,----------[----------------------.,----------]
This will take lowercase input from the keyboard and make it uppercase. To exit, press the enter key.
First, we input the first character using the ,
and immediately subtract 10 from it. (Most, but not all, brainfuck implementations use 10 as the code for a newline character.) If the user hits enter, the loop command ([
) will jump past the end of the loop, because we will have set the first byte to zero. If the character input was not a 10, we boldly assume it was a lowercase letter, and enter the loop, wherein we subtract another 22 from it, for a total of 32, which is the difference between an ASCII lowercase letter and the corresponding uppercase letter.
Next we output it. Now we input the next character, and again subtract 10. If this character was a linefeed, we exit the loop; otherwise, we go back to the start of the loop, subtract another 22, output, and so on. When we exit the loop, the program terminates, as there are no more commands.
[edit] Copying a byte
(Now things start to get a bit more complicated. We may as well refer to the bytes in the array as [0], [1], [2], and so on.)
Brainfuck does not include any operation for copying bytes. This must be done with the looping construct and arithmetical operators. Moving a byte is simple enough; moving the value of [0] to [1] can be done as follows:
>[-]<[->+<]
However, this resets the value of [0] to 0. We can restore the value of [0] after copying by taking advantage of the ability to copy a value to two places at once. To copy the value of [0] to both [1] and [2] is simple:
>[-]>[-]<<[->+>+<<]
We can take advantage of this to restore the value of [0]. Therefore, we can nondestructively copy [0] to [1] (using [2] as scratch space) as follows:
>[-]>[-]<<[->+>+<<]>>[-<<+>>]<<
[edit] Addition
,>++++++[<-------->-],[<+>-],<.>.
This program adds two single-digit numbers and displays the result correctly if it too has only one digit:
43
7
The first number is input in [0], and 48 is subtracted from it to correct it (the ASCII codes for the digits 0-9 are 48-57). This is done by putting a 6 in [1] and using a loop to subtract 8 from [0] that many times. (This is a common method of adding or subtracting large numbers.) Next, the second number is input in [1].
The next loop [<+>-]
does the real work, moving the second number onto the first, adding them together and zeroing [1]. Each time through, it adds one to [0] and subtracts one from [1]; so by the time [1] is zeroed, as many have been added to [0] as have been removed from [1]. Now a return is input in [1]. (We're not error-checking the input at all.)
Then the pointer is moved back to the [0], which is then output. ([0] is now a + (b + 48), since we didn't correct b; which is identical to (a + b) + 48, which is what we want.) Now the pointer is moved to [1], which holds the return that was input; it is now output, and we're done.
Apparently, some implementation prefers this variant which does not use linefeeds at all:
,>------[<-------->+],[<+>-]<.
[edit] Multiplication
,>,>++++++++[<------<------>>-] <<[>[>+>+<<-]>>[<<+>>-]<<<-] >>>++++++[<++++++++>-],<.>.
Like the previous, but does multiplication, not addition.
The first number is input in [0], the second number is input in [1], and both numbers are corrected by having 48 subtracted.
Now we enter the main multiplication loop. The basic idea is that each time through it we subtract one from [0] and add [1] to the running total kept in [2]. In particular: the first inner loop moves [1] onto both [2] and [3], while zeroing [1]. (This is the basic way to duplicate a number.) The next inner loop moves [3] back into [1], zeroing [3]. Then one is subtracted from [0], and the outer loop is ended. On exiting this loop, [0] is zero, [1] still has the second number in it, and [2] has the product of the two numbers. (Had we cared about keeping the first number, we could have added one to [4] each time through the outer loop, then moved the value from [4] back to [0] afterward.)
Now we add 48 to the product, input a return in [3], output the ASCIIfied product, and then output the return we just stored.
[edit] Division
This example accepts two numbers from the user, divides them, and displays the truncated quotient.
The following code prompts the user for two numbers; the dividend is stored in [0] and the divisor stored in [1] The code then enters the main loop. With each iteration of the loop, the code subtracts the divisor from the dividend. If the difference is greater than zero, the cell holding the quotient is incremented, and the process repeated until the dividend reaches zero.
,>,>++++++[-<--------<-------->>] Store 2 numbers from kb in (0) and (1); and subtract 48 from each <<[ Loop until the dividend is zero >[->+>+<<] Move the divisor in (1) to (2) and (3); setting (1) to zero >[-<<- Subtract 1 from both the dividend(0) and the divisor(2) until (2) is zero [>]>>>[<[>>>-<<<[-]]>>]<<] If the dividend is zero; exit the loop >>>+ Add one to the quotient in (5) <<[-<<+>>] Move the saved divisor in (3) to (2) <<<] Move ptr to {0} and repeat loop >[-]>>>>[-<<<<<+>>>>>] Move the quotient in (5) to (0) <<<<++++++[-<++++++++>]<. Add 48 and print result
[edit] Portability issues
Partly because Urban Müller did not write a thorough language specification, the many subsequent brainfuck interpreters and compilers have come to use slightly different dialects of brainfuck.
[edit] Cell size
In the classic distribution, the cells are 8-bit bytes, and this is still the most common size. However, to read non-textual data, a brainfuck program may need to distinguish an end-of-file condition from any possible byte value; thus 16-bit cells have also been used. Some implementations have used 32-bit cells, 64-bit cells, or bignum cells with practically unlimited range, but programs that use this extra range are likely to be slow, since storing or using cell values generally takes time proportional to the values stored or used.
In all these variants, the ,
and .
commands still read and write data in bytes. In most of them, the cells wrap around, i.e. incrementing a cell which holds its maximal value (with the +
command) will bring it to its minimal value and vice versa. The exceptions are implementations which are distant from the underlying hardware, implementations that use bignums, and implementations that try to enforce portability.
Fortunately, it is usually easy to write brainfuck programs that do not ever cause integer wraparound or overflow. Such programs thus do not depend heavily on cell size. Generally this means avoiding increment of +255 (unsigned char wraparound); or avoiding the overstepping the boundaries of [-128, +127] inclusive (signed char wraparound). For more details on integer wraparound, see the Integer overflow article.
[edit] Array size
In the classic distribution, the array has 30,000 cells, and the pointer begins at the leftmost cell. Any brainfuck implementation should thus provide at least that many cells, but surprisingly many implementations provide fewer. Even more cells are needed to store things like the millionth Fibonacci number, and the easiest way to make the language Turing-complete is to make the array unlimited on the right.
A few implementations extend the array to the left as well; this is an uncommon feature, and therefore portable brainfuck programs do not depend on it.
When the pointer moves outside the bounds of the array, some implementations will give an error message, some will try to extend the array dynamically, some will not notice and will produce unpredictable behavior, and a few will move the pointer to the opposite end of the array. Some tradeoffs are involved: expanding the array dynamically to the right is the most user-friendly approach and is good for memory-hungry programs, but it carries a speed penalty. If a fixed-size array is used it is helpful to make it very large, or better yet let the user set the size. Giving an error message for bounds violations is very useful for debugging but even that carries a speed penalty unless it can be handled by the operating system's memory protections.
[edit] End-of-line code
Different operating systems (and sometimes different programming environments) use subtly different versions of ASCII. The most important difference is in the code used for the end of a line of text. MS-DOS and Microsoft Windows use a CRLF, i.e. a 13 followed by a 10, in most contexts. UNIX and its descendants, including Linux and Mac OS X, use just 10, and older Macs use just 13. It would be unfortunate if brainfuck programs had to be rewritten for different operating systems. Happily, a unified standard is easy to find. Urban Müller's compiler and his example programs use 10, on both input and output; so do a large majority of existing brainfuck programs; and 10 is also more convenient to use than CRLF. Thus, brainfuck implementations should make sure that brainfuck programs that assume newline=10 will run properly; many do so, but some do not.
This assumption is also consistent with most of the world's sample code for C and other languages, in that they use '\n', or 10, for their newlines. On systems that use CRLF line endings, the C standard library transparently remaps "\n" to "\r\n" on output and "\r\n" to "\n" on input for streams not opened in binary mode.
[edit] End-of-file behavior
The behavior of the ,
command when an end-of-file condition has been encountered varies. Some implementations set the cell at the pointer to 0, some set it to the C constant EOF (in practice this is usually -1), some leave the cell's value unchanged. There is no real consensus; arguments for the three behaviors are as follows.
Setting the cell to 0 avoids the use of negative numbers, and makes it marginally more concise to write a loop that reads characters until EOF occurs. This is a language extension devised by Panu Kalliokoski.
Setting the cell to -1 allows EOF to be distinguished from any byte value (if the cells are larger than bytes), which is necessary for reading non-textual data; also, it is the behavior of the C translation of ,
given in Müller's readme file. However, it is not obvious that those C translations are to be taken as normative.
Leaving the cell's value unchanged is the behavior of Urban Müller's brainfuck compiler. This behavior can easily coexist with either of the others; for instance, a program that assumes EOF=0 can set the cell to 0 before each ,
command, and will then work correctly on implementations that do either EOF=0 or EOF="no change". It is so easy to accommodate the "no change" behavior that any brainfuck programmer interested in portability should do so.
[edit] Miscellaneous dialects
Many people have modified brainfuck, often by adding commands, occasionally by removing them. Some of these variants have value, but they should be (and usually are) given different names and considered separate languages, much like C++ with relation to C. Many of these are listed below. However, there are also minor variants, formed possibly as a result of inattention, of which some of the more common are:
- forbidding, rather than ignoring, any non-command characters in brainfuck programs
- introducing a comment marker which comments out the rest of the line
- various alterations of the loop semantics, sometimes destroying Turing completeness
- requiring a special character to mark the end of the program
[edit] Languages based on brainfuck
- Further information: Brainfuck derivatives on the Esolang wiki
Because brainfuck is so simple, it is easy to make other programming languages based on it:
- 2L: based loosely on PATH and Brainfuck, but has only two symbols.
- Boolfuck uses bits instead of bytes.
- Brainfork is a multi-threaded version of brainfuck with an additional "Y" instruction to fork the current thread.
- Brainfuck++ can be used in more applications by adding file I/O and networking support.
- Brainstab replaces the eight Brainfuck operators with phrases constructed from the words "stab" and "stabbity".
- Braintwist uses the eight operators from brainfuck, and an "X"-operator (uppercase "x") that swaps the data and code arrays, thus making it possible to store code in the data array, and execute it at a later time.
- COW reformats brainfuck's commands as various capitalisations of the word "Moo".
- Doublefuck has an additional array and eight additional instructions which perform brainfuck-identical operations on the second array. They are, in order: "^", "v", "/", "\", ":", ";", "{", and "}".
- f*ckf*ck replaces the Brainfuck operators with explicit words.
- Fm languages are program formulations of Turing machines, based on Böhm's (1964) language P′′ (a formal equivalent to brainfuck with no i/o instructions); e.g. F2 edits an unbounded bit-string memory using + < > [ ] only.
- l33t adds networking capabilities. (Very loosely based)
- L00P has an implicit loop, removes the "[" and "]" looping instructions and adds ten others.
- Nanopond Nanopond is an artificial life virtual machine whose simple evolvable instruction set is loosely based on Brainfuck.
- Ook! reformats brainfuck's commands as combinations of "Ook." "Ook!" and "Ook?".
- PATH and SNUSP are combinations of brainfuck with Befunge, a language which represents instructions as symbols in two-dimensional space.
- pbrain, which adds procedures to Brainfuck.
- Pi language obfuscates brainfuck commands as random errors in Pi digits.
- ReverseFuck uses "reversed" brainfuck's commands. ("+" means "-", "<" means ">", etc.)
- Spoon uses a Huffman coded set of brainfuck's instructions.
- THRAT uses two instructions to access brainfuck instructions on an opcode table.
- Trainfuck is another treatment of support for file I/O and networking; the page provides examples of usage.
- Self modifying Brainfuck is a modified version of Brainfuck which allows self modifying code.
[edit] External links
- Further information: Brainfuck on the esolangs wiki
- Brian Raiter, Muppetlabs. Brainfuck: An Eight-Instruction Turing-Complete Programming Language is a concise informational page with comments on portability.
- Panu Kalliokoski. The Brainfuck Archive has many brainfuck programs and implementations.
- Daniel Cristofani. some brainfuck fluff has a complete brainfuck reference, implementations and an amazing collection of programs.
- Frans Faase. Brainf*** is written from a mathematician's angle and has Turing-completeness proofs.
- A Brainfuck tutorial in English and French.
- Keymaker. bf-hacks.org features a collection of brainfuck programs.
- Brainfuck Algorithms and code snippets
- Lost Kingdom BF A conversational game written in brainfuck.
- Taking Over The World A game written in brainfuck.
- Rather theoretical elaboration about brainfuck incl. an interpreter/debugger-applet.
[edit] Implementations
- Brainfuck Developer, powerful development environment with integrated debugger (IDE) for Windows
- Delphi Brainfuck Interpreter a fast interpreter written in Delphi with a large collection of brainfuck scripts. Executable and source code included.
- Brainf*ck Pascal/Delphi implementation , simple and easy to understand
- TaccoMat the Brainfuck interpreter by Benjamin F. Loleit
- Also Written In Brainfuck (awib) is a brainfuck compiler written in brainfuck for Linux on i386.
- Brainfucked a small compatible open source compiler with syntax checking and code optimization for Microsoft Windows/DOS.
- Sascha Seidel, Brainfuck Express - Virtual Turing Machine - Development environment, editor, compiler, tools, tutorials, ASCII table and historical information about Brainfuck.
- Brainfuck Machine - BF IDE - Editor, compiler, debugger, interpreter, converter to other programming languages, ASCII table, changing BF instructions to own (for example + => a, - => b, ...) (for Windows)
- BFF is very fast optimizing interpreter written in portable C
- BFF4 is really fast interpreter written in C
- Blue Fern, a brainfuck IDE for Windows
- Brainfuck.Net
- Acme::Brainfuck, a Perl module that permits brainfuck programs to be embedded in Perl.
- Processing_BF is a PHP class featuring an optmizing parser, executor and PHP translator
- Minimalistic brainfuck interpreter in JavaScript, for people who do not want to spend ten minutes getting a compiler.
- brainf.py, a Python implementation by James Tauber.
- BrainF** Turbo Interpreter A fast web-based interpreter applet.
- kbfi A brainfuck interpreter written in brainfuck that can emulate cell values from zero to infinity. Designed in the traditional non-wrapping unsigned 8-bit cell (and non-wrapping array) brainfuck environment.
- BfI3 in English — brainfuck interpreter in JavaScript. Slow but features syntax highlighting. Doesn't work at all except on some versions of Internet Explorer.
- SchemeBF is fast brainfuck interpreter in Scheme.
- phpBrainfuckInterpreter is an Brainfuck interpreter which can be run from the web and can give verbose feedback as a program runs to aid in debugging.
- BrainF#ck is a full BrainFuck IDE for PalmOS. Complete with editor, step-by-step debugging and a runtime memory view.
- iBrainF*ck is another implementation of BrainFuck for PalmOS.
- Fred is a BrainFuck to C filter program, with ANSI C source under the GNU GPL.
- bfdb, an optimizing Brainfuck interpreter, debugger and compiler.
- libbf, a free portable library for Brainfuck fast interpretation, optimization, compilation and execution
- Branfuck.NET Compiler is a Brainfuck compiler for Microsoft .Net Framework. Compiles BF programms into Portable Executable files (exe)
- A Brainfuck Interpreter in PHP is a Brainfuck interpreter written in PHP
- Interpreter in Haskell is a Interpreter in Haskell
- Brain Editor and Fucker, or BEF*, is a BF interpreter written for TECO. The slim documentation can be found here and the commented version of the source here.
- Brainfuck Studio is brainfuck interpreter and debugger written in JavaScript.
- brainfoo is a small brainfuck interpreter written in C.
- Brainfuck VGA Brainfuck compiler for graphics programmers.
- http://www.michihiebl.de/?page_id=14 shows an online Version of a Brainfuckinterpreter written in PHP