Academic Block

Floating Point Numbers in MATLAB

MATLAB stores numeric values as floating point numbers by default (class double), which lets it represent fractional values as well as very large and very small magnitudes. MATLAB follows the IEEE 754 standard for floating-point arithmetic.

Understanding Floating Point Numbers

Floating point numbers consist of three parts:

  • Sign: Indicates whether the number is positive or negative.
  • Exponent: Determines the scale or magnitude of the number.
  • Fraction: Holds the significant digits of the number (also called the mantissa or significand) and determines its precision.
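
As a rough illustration of these three fields, the sketch below splits a double into its sign, exponent, and fraction bits and then rebuilds the value. It assumes the IEEE 754 double layout (1 sign bit, 11 exponent bits with a bias of 1023, 52 fraction bits) and a normalized input; the variable names are illustrative only.

```matlab
% Split a double into its IEEE 754 fields (assumes a normalized value)
x = -6.5;
bits = typecast(x, 'uint64');                           % reinterpret the 8 bytes as one integer

signBit  = bitshift(bits, -63);                         % top bit
expBits  = bitand(bitshift(bits, -52), uint64(2047));   % next 11 bits
fracBits = bitand(bits, uint64(2^52 - 1));              % low 52 bits

exponent = double(expBits) - 1023;                      % remove the exponent bias
fraction = double(fracBits) / 2^52;                     % fraction as a value in [0, 1)

% Rebuild the number: (-1)^sign * (1 + fraction) * 2^exponent
value = (-1)^double(signBit) * (1 + fraction) * 2^exponent;

disp(value == x);   % 1: the rebuilt value matches x exactly
```

Here -6.5 is stored as -1.625 * 2^2, so the exponent field decodes to 2 and the fraction to 0.625.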

Creating Floating Point Numbers

Floating point numbers can be created simply by using decimals in MATLAB:

% Examples of floating point numbers
a = 3.14;
b = -0.001;
c = 2.718e3; % Scientific notation, equivalent to 2718

disp(a);
disp(b);
disp(c);
Output will be (with the default format short):

3.1400
-1.0000e-03
2718
    

Floating Point Precision

MATLAB supports two primary precision types:

  • Single Precision: Uses 4 bytes (32 bits).
  • Double Precision: Uses 8 bytes (64 bits) and is the default in MATLAB.
% Checking precision of numbers
a = single(3.14); % Single precision
b = 3.14; % Double precision

disp(a);
disp(b);
disp(class(a)); % Displays: single
disp(class(b)); % Displays: double
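
To make the difference between the two types more concrete, the following sketch compares their resolution and shows what happens when they are mixed. It is only an illustration; the variable names s and d are not from the original text.

```matlab
% Compare the resolution of single and double precision
disp(eps('single'));   % spacing at 1.0 in single precision, about 1.19e-07
disp(eps('double'));   % spacing at 1.0 in double precision, about 2.22e-16

s = single(0.1);
d = 0.1;
fprintf('%.17g\n', double(s));   % the single value, widened to double for printing
fprintf('%.17g\n', d);
disp(double(s) == d);            % 0: the two types round 0.1 differently

% Mixing the two types produces a single-precision result
disp(class(s + d));              % single
```

Because an operation on a single and a double returns a single, one single-precision input can quietly lower the precision of a whole calculation.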

Common Issues with Floating Point Numbers

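Floating point arithmetic is not always exact, because most decimal fractions (such as 0.1) have no exact binary representation and must be rounded. For example:

% Floating point precision example
a = 0.1 + 0.2;
disp(a == 0.3); % Displays 0 (false)
fprintf('%.17g\n', a); % Print 17 significant digits to expose the rounding error
Output will be:

0
0.30000000000000004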
    

Use the eps function to find the spacing between adjacent floating-point numbers:

% Checking the spacing of floating-point numbers
disp(eps); % Distance from 1.0 to the next larger double
disp(eps(1.0)); % Same value: spacing of doubles at 1.0
Output will be:

2.2204e-16
2.2204e-16
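
The spacing is not constant: it grows with the magnitude of the number. The short sketch below illustrates this with eps(x), which returns the gap between x and the next larger number of the same type.

```matlab
% eps(x) is the gap between x and the next larger number of the same type
disp(eps(1));        % 2.2204e-16
disp(eps(1000));     % 1.1369e-13: a larger gap at a larger magnitude
disp(eps(2^53));     % 2: above 2^53, not every integer is representable

big = 2^53;
disp(big + 1 == big);   % 1 (true): adding 1 is lost to rounding
```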
    

Floating Point Operations

Floating point numbers support standard arithmetic operations:

% Basic arithmetic with floating point numbers
a = 5.5;
b = 2.2;
add = a + b;
sub = a - b;
mul = a * b;
div = a / b;

disp(add);
disp(sub);
disp(mul);
disp(div);
Output will be:

7.7000
3.3000
12.1000
2.5000
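
One consequence of rounding after every operation is that the usual algebraic rules hold only approximately. As a small sketch, the grouping of an addition can change the result:

```matlab
% Addition is not associative in floating point
left  = (0.1 + 0.2) + 0.3;
right = 0.1 + (0.2 + 0.3);

fprintf('%.17g\n', left);
fprintf('%.17g\n', right);
disp(left == right);        % 0: the grouping changed the rounding
disp(abs(left - right));    % the difference is no larger than eps
```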
    

Scientific Notation

Scientific notation is a convenient way to handle very large or small numbers in MATLAB:

% Examples of scientific notation
a = 3.14e2; % Equivalent to 3.14 * 10^2
b = 1.6e-3; % Equivalent to 1.6 * 10^-3

disp(a);
disp(b);
Output will be:

314
0.0016

Handling Special Numbers

MATLAB can handle special floating-point numbers such as infinity and NaN (Not-a-Number):

% Infinity and NaN examples
a = 1/0; % Positive infinity
b = -1/0; % Negative infinity
c = 0/0; % NaN

disp(a);
disp(b);
disp(c);
Output will be:

Inf
-Inf
NaN
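
These values can be detected with isinf, isnan, and isfinite. The sketch below also shows why NaN has to be tested with isnan rather than ==; the sample data is made up for illustration.

```matlab
vals = [1/0, -1/0, 0/0, 42];

disp(isinf(vals));      % 1 1 0 0
disp(isnan(vals));      % 0 0 1 0
disp(isfinite(vals));   % 0 0 0 1

% NaN is not equal to anything, including itself
disp(NaN == NaN);       % 0
disp(Inf - Inf);        % NaN

% Filter out NaN entries explicitly before aggregating
data = [1.5, NaN, 2.5];
disp(mean(data(~isnan(data))));   % 2
```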

Comparing Floating Point Numbers

Direct comparison of floating-point numbers can be unreliable. Use a tolerance instead:

% Comparing floating-point numbers
a = 0.1 + 0.2;
b = 0.3;
tol = 1e-10; % Tolerance

if abs(a - b) < tol
    disp('a and b are approximately equal');
else
    disp('a and b are not equal');
end
Output will be:

a and b are approximately equal
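
A fixed absolute tolerance such as 1e-10 works for values near 1 but is too strict for very large numbers and too loose for very small ones. One possible refinement, sketched below, combines a relative and an absolute tolerance. The helper isClose and both tolerance values are assumptions chosen for illustration, not built-in MATLAB names.

```matlab
% isClose is a hypothetical helper, not a built-in MATLAB function
relTol = 1e-9;    % assumed relative tolerance
absTol = 1e-12;   % assumed absolute tolerance for values near zero
isClose = @(a, b) abs(a - b) <= max(absTol, relTol * max(abs(a), abs(b)));

disp(isClose(0.1 + 0.2, 0.3));      % 1
disp(isClose(1e20 + 1e5, 1e20));    % 1: the difference is tiny relative to 1e20
disp(isClose(1.0, 1.001));          % 0
```

The right tolerances depend on the scale and accuracy of the data, so these values should be adjusted for each problem.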

Rounding Floating Point Numbers

MATLAB provides several functions to round floating-point numbers:

% Rounding examples
a = 3.14159;
round_a = round(a); % Round to nearest integer
floor_a = floor(a); % Round toward negative infinity
ceil_a = ceil(a); % Round toward positive infinity

disp(round_a);
disp(floor_a);
disp(ceil_a);
Output will be:

3
3
4
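
The differences between these functions show up mainly with negative numbers and ties. The sketch below also uses fix, and the two-argument form round(a, n), which assumes MATLAB R2014b or later.

```matlab
a = 3.14159;
disp(round(a, 2));   % 3.1400: round to 2 decimal places
disp(fix(-2.7));     % -2: round toward zero
disp(floor(-2.7));   % -3: round toward negative infinity
disp(round(2.5));    % 3: ties round away from zero
disp(round(-2.5));   % -3
```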

Floating Point Limits

Understanding the limits of floating-point representation is essential:

% Checking floating-point limits
max_val = realmax; % Largest representable floating-point number
min_val = realmin; % Smallest positive normalized floating-point number

disp(max_val);
disp(min_val);
Output will be:

1.7977e+308
2.2251e-308
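
The limits depend on the data type. As a brief sketch, the single-precision limits and the largest consecutive integer that a double can represent (flintmax) can be queried in the same way:

```matlab
disp(realmax('single'));   % about 3.4028e+38
disp(realmin('single'));   % about 1.1755e-38
disp(flintmax);            % 2^53, about 9.0072e+15

% Above flintmax, consecutive integers can no longer be told apart
disp(flintmax + 1 == flintmax);   % 1 (true)
```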

Simulating Overflow and Underflow

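Operations that exceed the limits of floating-point representation cause overflow or underflow. Results just below realmin are still stored, with reduced precision, as subnormal numbers; only much smaller results underflow to 0:

% Overflow and underflow examples
overflow = realmax * 2; % Results in Inf
subnormal = realmin / 2; % Still representable as a subnormal number
underflow = realmin / 2^54; % Results in 0 (too small to represent)

disp(overflow);
disp(subnormal);
disp(underflow);
Output will be:

Inf
1.1125e-308
0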
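
Overflow can also occur in an intermediate step even when the final answer is well within range. The sketch below shows one such case and a way around it using the built-in hypot function; the input values are arbitrary.

```matlab
a = 3e200;
b = 4e200;

naive = sqrt(a^2 + b^2);   % a^2 overflows to Inf before the square root is taken
safe  = hypot(a, b);       % built-in that avoids the intermediate overflow

disp(naive);   % Inf
disp(safe);    % 5.0000e+200
```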

Precision of Floating Point Arithmetic

The following example shows how an increment smaller than the spacing between numbers is lost to rounding:

% Precision effects
x = 1.0;
y = x + eps; % eps is the gap between 1.0 and the next larger double
z = x + eps/2; % Half the gap: the sum rounds back to 1.0

disp(x == y); % False, as y is slightly larger than x
disp(x == z); % True, as the increment is too small to affect x
Output will be:

0
1
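
This effect matters in practice when a tiny value is added to 1 and the 1 is later removed again. MATLAB provides log1p and expm1 for this situation; the sketch below compares them with the direct formulas for an arbitrarily chosen small x.

```matlab
x = 1e-16;               % smaller than eps/2, so 1 + x rounds to 1

disp(log(1 + x));        % 0: x was lost when forming 1 + x
disp(log1p(x));          % 1.0000e-16: computed without forming 1 + x

disp(exp(x) - 1);        % 0: exp(x) rounds to 1
disp(expm1(x));          % 1.0000e-16
```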

Practical Problem: Summing a Large Series

Summing a large series of floating-point numbers can introduce errors, because every addition is rounded and the rounding errors accumulate:

% Summing a large series
n = 1e7;
series = 1 ./ (1:n); % Terms of the harmonic series 1 + 1/2 + ... + 1/n
sum_result = sum(series);

disp(sum_result); % Rounding errors accumulate over the additions
Output will be:

16.6953
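
In double precision the accumulated error is too small to see in the displayed digits. To make it visible, the sketch below repeats the sum in single precision with a simple loop and compares it with compensated (Kahan) summation, a standard technique that is not part of the original example. The smaller n and all variable names are choices made for illustration.

```matlab
n = 1e6;
terms = single(1 ./ (1:n));       % harmonic terms stored in single precision
reference = sum(1 ./ (1:n));      % double-precision sum used as the reference

naiveSum = single(0);
kahanSum = single(0);
comp = single(0);                 % running compensation for lost low-order bits
for k = 1:n
    naiveSum = naiveSum + terms(k);

    y = terms(k) - comp;          % apply the correction from the previous step
    t = kahanSum + y;
    comp = (t - kahanSum) - y;    % what was lost when adding y
    kahanSum = t;
end

fprintf('naive error: %.3g\n', abs(double(naiveSum) - reference));
fprintf('Kahan error: %.3g\n', abs(double(kahanSum) - reference));
```

The compensated sum should land much closer to the reference, because it carries forward the low-order bits that each plain addition discards.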

Useful MATLAB Functions for Floating Point Numbers

| Function | Explanation |
|----------|-------------|
| eps | Returns the spacing between 1.0 and the next larger floating-point number; `eps(x)` returns the spacing at x. |
| realmax | Returns the largest finite floating-point number for a specific data type (e.g., `realmax('single')`, or `realmax` for double). |
| realmin | Returns the smallest positive normalized floating-point number for a specific data type (e.g., `realmin('single')`, or `realmin` for double). |
| isnan | Checks if a value is NaN (Not-a-Number). |
| isinf | Checks if a value is infinite. |
| single | Converts a value to single precision, which uses less memory than double precision. |
| double | Converts a value to double precision, which is the default numeric data type in MATLAB. |
| long | Not a native MATLAB data type. It is commonly used in other programming languages to represent large integers. MATLAB primarily uses `double` or `int64` for such purposes. |
| int8 | Creates an 8-bit signed integer, which can store values from -128 to 127. |
| isfloat | Checks if a variable is a floating-point number, returning `true` for both single and double types. |
| round | Rounds a floating-point number to the nearest integer. |

Practice Questions

Test Yourself

1. Create a floating point number using scientific notation and check its class.

2. Add two single precision numbers and compare the result with double precision addition.

3. Write a MATLAB script to find the smallest number that can be added to 1.0 to produce a value greater than 1.0.

4. Experiment with the isnan and isinf functions by dividing numbers by zero.