MATLAB Tutorial

Floating Point Numbers in MATLAB

Floating point numbers are used in MATLAB to represent non-integer values, including decimals and very large or small numbers. MATLAB follows the IEEE 754 standard for floating-point arithmetic.

Understanding Floating Point Numbers

Floating point numbers consist of three parts:

Sign: Indicates whether the number is positive or negative.
Exponent: Determines the scale or magnitude of the number.
Fraction (also called the mantissa or significand): Holds the significant digits, which determine the precision of the number.
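
As a rough illustration of these three parts, the sketch below splits a double into its raw bit fields. It assumes the IEEE 754 double layout (1 sign bit, 11 exponent bits with a bias of 1023, and 52 fraction bits). The variable names are chosen here for illustration only.

```matlab
% Split a double into its sign, exponent, and fraction fields
x = -6.5;                                   % -6.5 = -1.625 * 2^2
bits = typecast(x, 'uint64');               % reinterpret the 8 bytes as an integer

signBit  = bitshift(bits, -63);                         % top bit
expBits  = bitand(bitshift(bits, -52), uint64(2047));   % next 11 bits
fracBits = bitand(bits, uint64(2^52 - 1));              % low 52 bits

% Rebuild the value from the three fields (valid for normalized numbers only)
rebuilt = (-1)^double(signBit) * (1 + double(fracBits) / 2^52) ...
          * 2^(double(expBits) - 1023);

fprintf('sign = %d, exponent = %d, fraction = %.3f\n', ...
        signBit, double(expBits) - 1023, double(fracBits) / 2^52);
disp(rebuilt == x);                         % 1: the fields reproduce x exactly
```

For x = -6.5 the sign bit is 1, the unbiased exponent is 2, and the fraction field encodes 0.625, so the value is -(1 + 0.625) * 2^2.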

Creating Floating Point Numbers

Floating point numbers can be created simply by using decimals in MATLAB:


% Examples of floating point numbers
a = 3.14;
b = -0.001;
c = 2.718e3;  % Scientific notation, equivalent to 2718

disp(a);
disp(b);
disp(c);

Output (in the default short display format) will be:

3.1400
-1.0000e-03
2718

Floating Point Precision

MATLAB supports two primary precision types:

Single Precision: Uses 4 bytes (32 bits) and holds about 7 significant decimal digits.
Double Precision: Uses 8 bytes (64 bits), holds about 16 significant decimal digits, and is the default in MATLAB.


% Checking precision of numbers
a = single(3.14);  % Single precision
b = 3.14;          % Double precision

disp(a);
disp(b);
disp(class(a));    % Output: single
disp(class(b));    % Output: double
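
To make the difference between the two types more concrete, here is a small sketch that stores the same decimal value in both precisions and prints the spacing at 1.0 for each. The variable names s and d are illustrative.

```matlab
% Compare the resolution of single and double precision
s = single(0.1);
d = 0.1;

fprintf('single: %.20f\n', s);   % stored value drifts from 0.1 after about 8 digits
fprintf('double: %.20f\n', d);   % stored value drifts from 0.1 after about 17 digits

disp(eps('single'));             % spacing at 1.0 in single precision (2^-23)
disp(eps('double'));             % spacing at 1.0 in double precision (2^-52)
disp(double(s) == d);            % 0: the two stored values differ
```

Neither type stores 0.1 exactly. The double is simply a much closer approximation.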

Common Issues with Floating Point Numbers

Floating point arithmetic is not always exact due to precision limits. For example:


% Floating point precision example
a = 0.1 + 0.2;

disp(a == 0.3);          % Returns false (displayed as 0)
fprintf('%.17g\n', a);   % Print 17 significant digits; disp(a) would show only 0.3000

Output will be:

0
0.30000000000000004

The eps function returns the spacing between a floating-point number and the next larger one, which helps when choosing a tolerance:


% Checking precision threshold
disp(eps);       % Distance from 1.0 to the next larger double
disp(eps(1.0));  % Same value, with the reference point given explicitly

Output will be:

2.2204e-16
2.2204e-16
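
The spacing reported by eps is not a single global constant: it grows with the magnitude of its argument. The following sketch illustrates this for a few arbitrarily chosen values.

```matlab
% The spacing between adjacent doubles grows with magnitude
disp(eps(1));        % 2^-52
disp(eps(1000));     % 2^-43
disp(eps(1e16));     % 2

big = 1e16;
disp(big + 1 == big);   % 1: an increment of 1 is below the spacing at 1e16
```

This is why a fixed absolute tolerance that works near 1.0 can be far too strict for large values.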

Floating Point Operations

Floating point numbers support standard arithmetic operations:


% Basic arithmetic with floating point numbers
a = 5.5;
b = 2.2;
add = a + b;
sub = a - b;
mul = a * b;
div = a / b;

disp(add);
disp(sub);
disp(mul);
disp(div);

Output will be:

7.7000
3.3000
12.1000
2.5000

Scientific Notation

Scientific notation is a convenient way to handle very large or small numbers in MATLAB:


% Examples of scientific notation
a = 3.14e2;  % Equivalent to 3.14 * 10^2
b = 1.6e-3;  % Equivalent to 1.6 * 10^-3

disp(a);
disp(b);

Output will be:

314
0.0016

Handling Special Numbers

MATLAB can handle special floating-point numbers such as infinity and NaN (Not-a-Number):


% Infinity and NaN examples
a = 1/0;        % Positive infinity
b = -1/0;       % Negative infinity
c = 0/0;        % NaN

disp(a);
disp(b);
disp(c);

Output will be:

Inf
-Inf
NaN
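
Special values need dedicated tests, because NaN in particular does not compare equal to anything. A minimal sketch using the built-in isinf, isnan, and isfinite functions (the values array is made up for illustration):

```matlab
% Detecting Inf and NaN
values = [1/0, -1/0, 0/0, 42];

disp(isinf(values));     % 1 1 0 0
disp(isnan(values));     % 0 0 1 0
disp(isfinite(values));  % 0 0 0 1

c = 0/0;
disp(c == c);            % 0: NaN is not equal to anything, including itself
```

Because c == c is false for NaN, use isnan rather than == to detect it.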

Comparing Floating Point Numbers

Direct comparison of floating-point numbers can be unreliable. Use a tolerance instead:


% Comparing floating-point numbers
a = 0.1 + 0.2;
b = 0.3;
tol = 1e-10;  % Tolerance

if abs(a - b) < tol
    disp('a and b are approximately equal');
else
    disp('a and b are not equal');
end

Output will be:

a and b are approximately equal
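
A fixed absolute tolerance works well for values near 1, but it can be too strict for very large numbers. One possible alternative is a relative tolerance, sketched below. The isClose helper is hypothetical (defined here, not a MATLAB built-in), and the tolerance values are arbitrary choices.

```matlab
% Relative-tolerance comparison (isClose is a hypothetical helper defined here)
isClose = @(x, y, relTol) abs(x - y) <= relTol * max(abs(x), abs(y));

disp(isClose(0.1 + 0.2, 0.3, 1e-12));   % 1

p = 1e20 + 1e5;
q = 1e20;
disp(abs(p - q) < 1e-10);               % 0: absolute tolerance is too strict at this scale
disp(isClose(p, q, 1e-12));             % 1: the difference is tiny relative to 1e20
```

A purely relative test has its own weakness: it becomes very strict when both values are close to zero, so some code combines an absolute and a relative tolerance.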

Rounding Floating Point Numbers

MATLAB provides several functions to round floating-point numbers:


% Rounding examples
a = 3.14159;
round_a = round(a);       % Round to nearest integer
floor_a = floor(a);       % Round down
ceil_a = ceil(a);         % Round up

disp(round_a);
disp(floor_a);
disp(ceil_a);

Output will be:

3
3
4
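
The rounding functions differ most visibly for negative inputs. The sketch below, using arbitrarily chosen values, also shows fix (round toward zero) and one simple way to round to two decimal places.

```matlab
% Rounding behavior for negative values and for decimal places
v = -3.7;
disp(round(v));     % -4: nearest integer
disp(floor(v));     % -4: toward negative infinity
disp(ceil(v));      % -3: toward positive infinity
disp(fix(v));       % -3: toward zero

disp(round(-2.5));  % -3: ties are rounded away from zero

a = 3.14159;
disp(round(a * 100) / 100);   % 3.1400: rounded to two decimal places
```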

Floating Point Limits

Understanding the limits of floating-point representation is essential:


% Checking floating-point limits
max_val = realmax;   % Largest representable floating-point number
min_val = realmin;   % Smallest positive normalized floating-point number

disp(max_val);
disp(min_val);

Output will be:

1.7977e+308
2.2251e-308
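
The values above apply to double precision. As a brief sketch, the same functions accept a type name, so the much narrower range of single precision can be checked in the same way:

```matlab
% Limits depend on the floating-point type
disp(realmax('single'));             % 3.4028e+38
disp(realmin('single'));             % 1.1755e-38
disp(class(realmax('single')));      % single
disp(realmax('single') < realmax);   % 1: the double range is far wider
```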

Simulating Overflow and Underflow

Results larger than realmax overflow to Inf, and results far smaller than realmin underflow to 0:


% Overflow and underflow examples
overflow = realmax * 2;      % Results in Inf
underflow = realmin / 1e20;  % Results in 0 (too small to represent, even as a subnormal)

disp(overflow);
disp(underflow);

Output will be:

Inf
0
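
Underflow to zero is gradual rather than abrupt. Values just below realmin are stored as subnormal (denormalized) numbers with reduced precision, and only much smaller results round to 0. The sketch below assumes the default IEEE 754 behavior, with no flush-to-zero mode in effect.

```matlab
% Gradual underflow: values below realmin become subnormal before reaching zero
half = realmin / 2;
disp(half > 0);          % 1: still representable, with reduced precision

tiny = realmin * eps;    % smallest positive subnormal double (2^-1074)
disp(tiny);              % 4.9407e-324
disp(tiny / 4 == 0);     % 1: finally rounds to zero
```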

Precision of Floating Point Arithmetic

An increment of eps changes 1.0, but an increment of half that size is lost to rounding:


% Precision effects
x = 1.0;
y = x + eps;        % eps is the gap between 1.0 and the next larger double
z = x + eps/2;      % Half that gap: the sum rounds back to 1.0

disp(x == y);       % False, as y is slightly larger than x
disp(x == z);       % True, as the increment is too small to affect x

Output will be:

0
1
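
A related consequence of rounding is that floating-point addition is not associative: grouping the same terms differently can change the last bit of the result. A minimal sketch, with the variable names left and right chosen for illustration:

```matlab
% The same three terms, grouped two different ways
left  = (0.1 + 0.2) + 0.3;
right = 0.1 + (0.2 + 0.3);

disp(left == right);                      % 0: the two groupings round differently
fprintf('%.17g\n%.17g\n', left, right);   % print both results with 17 significant digits
```

The two results differ by no more than eps, which is another reason to compare with a tolerance rather than with ==.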

Practical Problem: Summing a Large Series

Summing a large series of floating-point numbers can introduce errors due to limited precision:


% Summing a large series
n = 1e7;
series = 1 ./ (1:n);  % First n terms of the harmonic series
sum_result = sum(series);

disp(sum_result);     % Rounding error accumulates across the additions

Output will be:

16.6953
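
One common way to reduce accumulated rounding error is compensated (Kahan) summation, which is not part of the example above. The sketch below is only an illustration under stated assumptions: it switches to single precision and a smaller n (1e6) so that the effect is visible and the loop runs quickly, and it treats the double-precision sum as the reference value. All variable names are chosen here.

```matlab
% Naive versus compensated (Kahan) summation in single precision
n = 1e6;                              % smaller than 1e7 to keep the loop fast
terms = single(1 ./ (1:n));           % harmonic terms stored in single precision

naive = single(0);                    % plain running sum
kahan = single(0);                    % compensated running sum
comp  = single(0);                    % running estimate of the lost low-order bits

for k = 1:n
    naive = naive + terms(k);

    y = terms(k) - comp;              % apply the correction from the previous step
    t = kahan + y;                    % low-order bits of y may be lost here
    comp = (t - kahan) - y;           % recover what was lost
    kahan = t;
end

reference = sum(1 ./ (1:n));          % double-precision sum used as the reference

fprintf('naive error: %.3g\n', abs(double(naive) - reference));
fprintf('kahan error: %.3g\n', abs(double(kahan) - reference));
```

The compensated sum should land much closer to the reference than the naive loop, at the cost of a few extra operations per term.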

Useful MATLAB Functions for Floating Point Numbers

Function: Explanation

eps: Returns the spacing between 1.0 and the next larger double; `eps(x)` returns the spacing at x.
realmax: Returns the largest finite floating-point number for a given data type (e.g., `realmax('single')`, or `realmax` for double).
realmin: Returns the smallest positive normalized floating-point number for a given data type (e.g., `realmin('single')`, or `realmin` for double).
isnan: Checks if a value is NaN (Not-a-Number).
isinf: Checks if a value is infinite.
single: Creates a single-precision floating-point number, which uses less memory than a double-precision number.
double: Creates a double-precision floating-point number, which is the default numeric data type in MATLAB.
long: Not a native MATLAB data type. It is commonly used in other programming languages to represent large integers. MATLAB primarily uses `double` or `int64` for such purposes.
int8: Creates an 8-bit signed integer, which can store values from -128 to 127.
isfloat: Checks if a variable is a floating-point number, returning `true` for both single and double types.
round: Rounds a floating-point number to the nearest integer.

Practice Questions

Test Yourself

1. Create a floating point number using scientific notation and check its class.

2. Add two single precision numbers and compare the result with double precision addition.

3. Write a MATLAB script to find the smallest number that can be added to 1.0 to produce a value greater than 1.0.

4. Experiment with the isnan and isinf functions by dividing numbers by zero.