CHAPTER 1
INTRODUCTION
For the first time in history, and thanks to the exponential growth rate of computing power, an increasing number of scientists are finding that more time is spent creating, rather than executing, working programs. Indeed, much effort is spent writing small programs to automate otherwise tedious forms of analysis. In the future, this imbalance will doubtless be addressed by the adoption and teaching of more efficient programming techniques. An important step in this direction is the use of higher-level programming languages, such as F#, in place of more conventional languages for scientific programming such as Fortran, C, C++ and even Java and C#.
In this chapter, we shall begin by laying down some guidelines for good programming that are applicable in any language. We shall then briefly review the history of the F# language and outline the features of the language that enforce some of these guidelines, along with other features that allow the remaining guidelines to be met. As we shall see, these aspects of the design of F# greatly improve reliability and development speed. Given that a freely available, efficient compiler already exists for this language, it is no wonder that F# is already being adopted by scientists of all disciplines.
1.1 PROGRAMMING GUIDELINES
Some generic guidelines can be productively adhered to when programming in any language:
Correctness over performance: Programs should be written correctly first and optimized last.
Factor programs: Complicated or common operations should be factored out into separate functions or objects.
Interfaces: Abstract interfaces should be designed and concrete implementations should be coded to these interfaces.
Avoid magic numbers: Numeric constants should be defined once and referred back to, rather than explicitly “hard-coding” their value multiple times at different places in a program.
Following these guidelines is the first step towards reusable programs.
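As a minimal sketch of how these guidelines might look in F#, consider the following fragment. The names boltzmann, thermalEnergy, IPotential and harmonic are hypothetical, chosen purely for illustration, and the code assumes a recent F# compiler in which the light syntax is the default.

```fsharp
// Avoid magic numbers: the constant is defined once and referred back to.
// (Boltzmann constant in J/K; the name is chosen here for illustration.)
let boltzmann = 1.380649e-23

// Factor programs: a common operation lives in one small function.
let thermalEnergy temperature = boltzmann * temperature

// Interfaces: an abstract interface is designed first...
type IPotential =
    abstract Energy : float -> float

// ...and a concrete implementation is coded to that interface.
let harmonic k =
    { new IPotential with
        member this.Energy x = 0.5 * k * x * x }

let roomTemperatureEnergy = thermalEnergy 300.0
let springEnergy = (harmonic 2.0).Energy(3.0)
```

Code that accepts an IPotential can be reused unchanged with any other implementation of the interface, which is the sense in which these guidelines lead towards reusable programs.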
1.2 A BRIEF HISTORY OF F#
The first version of ML (Meta Language) was developed at Edinburgh University in the 1970s as a language designed to efficiently represent and manipulate other languages. The original ML language was pioneered by Robin Milner for the Logic of Computable Functions (LCF) theorem prover. The original ML, and its derivatives, were designed to stretch theoretical computer science to the limit, yielding remarkably robust and concise programming languages without sacrificing the performance of low-level languages.
The Categorical Abstract Machine Language (CAML) was the acronym originally used to describe what is now known as the Caml family of languages, a dialect of ML that was designed and implemented by Gérard Huet at the Institut National de Recherche en Informatique et en Automatique (INRIA) in France, until 1994. Since then, development has continued as part of projet Cristal, now led by Xavier Leroy. Objective Caml (OCaml) is the current flagship language of projet Cristal. The OCaml programming language is one of the foremost high-performance and high-level programming languages used by scientists on the Linux and Mac OS X platforms [11].
Don Syme at Microsoft Research Cambridge has meticulously engineered the F# language for .NET, drawing heavily upon the success of the Caml family of languages. The F# language combines the remarkable brevity and robustness of the Caml family of languages with .NET interoperability, facilitating seamless integration of F# programs with any other programs written in .NET languages. Moreover, F# is the first mainstream language to implement some important features such as active patterns and asynchronous programming constructs.
1.3 BENEFITS OF F#
Before delving into the syntax of the language itself, we shall list the main advantages offered by the F# language:
Safety: F# programs are thoroughly checked prior to execution such that they are proven to be entirely safe to run, e.g. a compiled F# program cannot cause an access violation.
Functional: Functions may be nested, passed as arguments to other functions and stored in data structures as values.
Strongly typed: The types of all values are checked during compilation to ensure that they are well defined and validly used.
Statically typed: Any typing errors in a program are picked up at compile-time by the compiler, instead of at run-time as in many other languages.
Type inference: The types of values are automatically inferred during compilation from the context in which they occur. Therefore, the types of variables and functions in F# code rarely need to be specified explicitly, dramatically reducing source code size. Clarity is regained by displaying inferred type information in the integrated development environment (IDE).
Generics: Functions are automatically generalized by the F# compiler, greatly simplifying the writing of reusable functions.
Pattern matching: Values, particularly the contents of data structures, can be matched against arbitrarily complicated patterns in order to determine the appropriate course of action.
Modules and objects: Programs can be structured by grouping their data structures and related functions into modules and objects.
Separate compilation: Source files can be compiled separately into object files that are then linked together to form an executable or library. When linking, object files are automatically type checked and optimized before the final executable is created.
Interoperability: F# programs can call and be called from programs written in other Microsoft .NET languages (e.g. C#), native code libraries and over the internet.
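Several of these features can be seen together in a few lines. The following is only an illustrative sketch: the names twice, addTwo, shout and sum are hypothetical and do not come from any library.

```fsharp
// Functional: functions are values that can be passed to other functions.
let twice f x = f (f x)

// Type inference and generics: no type annotations are written, yet the
// compiler infers the general type ('a -> 'a) -> 'a -> 'a for twice,
// so the same function works on integers and on strings.
let addTwo = twice (fun n -> n + 1)
let shout = twice (fun s -> s + "!")

// Pattern matching: the shape of the list selects the course of action.
let rec sum xs =
    match xs with
    | [] -> 0
    | head :: tail -> head + sum tail

let seven = addTwo 5
let greeting = shout "hi"
let total = sum [1; 2; 3; 4]
```

Here seven is 7, greeting is "hi!!" and total is 10. Static typing means that a misuse such as applying addTwo to a string would be rejected at compile-time rather than failing at run-time.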
1.4 INTRODUCING F#
F# programs are typically written in Microsoft Visual Studio and can be executed either following a complete build or incrementally from the F# interactive mode. Throughout this book we shall present code snippets in the form seen using the F# interactive mode, with code input following the prompt:
>
Setup and use of the interactive mode is covered in more detail in chapter 2. Throughout this book, we assume the use of the #light syntax option, which requires the following command to be evaluated before any of the code examples:
> #light;;
Before we consider the features offered by F#, a brief overview of the syntax of the language is instructive, so that we can provide actual code examples later. Other books give more systematic, thorough and formal introductions to the whole of the F# language [25, 22].
1.4.1 Language overview
In this section we shall develop the notions of values, types, variables, functions, simple containers (lists and arrays) and program flow control. These notions will then be used to introduce more advanced features in the later sections of this chapter.
When presented with a block of code, even the most seasoned and fluent programmer may be unable to infer its purpose. Consequently, programs should contain additional descriptions written in plain English, known as comments. In F#, comments are either enclosed between (* and *) or written after // or /// on a single line. Comments appearing after /// are known as autodoc comments, and Visual Studio interprets them as official documentation according to standard .NET coding guidelines.
Comments may be nested, i.e. (* (* … *) *) is a valid comment, and comments are treated as whitespace, i.e. a (* … *) b is understood to mean a b rather than ab.
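The comment forms described above can be sketched as follows. The function square and the value nine are hypothetical names introduced only for this illustration.

```fsharp
(* A block comment may span several lines
   and may contain (* a nested comment *) without ending early. *)

/// Returns the square of x. This autodoc comment documents the definition below.
let square x = x * x   // a single-line comment runs to the end of the line

// A comment is treated as whitespace, so this applies square to 3.
let nine = square (* the argument follows *) 3
```

Nesting is convenient in practice because a region of code that already contains block comments can itself be commented out without the first *) ending the outer comment.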
Just as numbers are the members of sets such as the integers ℤ, reals ℝ, complexes ℂ and so on, so values in programs are members of sets. These sets are known as types.
1.4.1.1 Basic types
Fundamentally, languages provide basic types and, often, allow more sophisticated types to be defined in terms of the basic types. F# provides a number of built-in types, such as unit, int, float, char, string and bool. We shall examine these built-in types before discussing the compound tuple, record and variant (also known as discriminated union) types.
Only one value is of type unit. This value is written () and therefore conveys no information. It is used to implement functions that require no input or expressions that return no value. For example, a new line can be printed by calling the print_newline function:
> print_newline ();;
val it : unit = ()
This function requires no input, so it accepts a single argument () of the type unit, and returns the value () of type unit.
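User-defined functions can use the unit type in the same way. The sketch below assumes only the printfn function from the F# core library, since print_newline may not be available in recent F# distributions without a compatibility library. The name printBlankLine is hypothetical.

```fsharp
// printfn is part of the F# core library; printBlankLine is a name chosen here.
// The function accepts the unit value () and returns the unit value (),
// so its inferred type is unit -> unit.
let printBlankLine () = printfn ""

// Applying the function to () prints a new line and yields ().
let result = printBlankLine ()
```

Without the argument (), the definition would instead bind a plain value of type unit that is evaluated once, rather than a function that prints each time it is called.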
Integers are written -2, -1, 0, 1 and 2. Floating-point numbers are written -2.0, -1.0, -0.5, 0.0, 0.5, 1.0 and 2.0. Note that a zero fractional part may be omitted, so 3.0 may be written 3., but we choose the more verbose format for purely esthetic reasons. For example:
> 3;;
val it : int = 3
> 5.0;;
val it : float = 5.0
Arithmetic can be performed using the conventional +, -, *, / and % binary infix operators over many arithmetic types including int and float.
For example, the following expression is evaluated according to usual mathematical convention regarding operator precedence, with multiplication taking precedence over addition:
...