lidar image

Lidar remote sensing data is a point type data which contains the X, Y, Z coordinates along with its other features like return number, classification code, etc. Lidar data is stored in different formats like .las, .laz, .pts, .xyz etc. One of the frequently used format is LAS format. Refer to the document having details of different versions of las format.

LAS formats contains two parts header and points data. Header contains information about the data like metadata information and points data contains information about each point data.

Laspy is a library to read and write the las data. It can be installed using the command:

conda install -c conda-forge laspy

import laspy
import numpy as np
from mpl_toolkits import mplot3d

Coming to the data part, there are many organizations providing the data freely. Some of the sites are NOAA (National Oceanic and Atmospheric Administration) which have large Lidar data set of USA coastal regions, USGS provides the Lidar data of USA, United States Inter agency Elevation Inventory (USIEI) , Open Topography which is a public domain where users upload data for public, GeoSUD (some areas of Europe, Asia, Africa and South America), LIPAD (Phillipines Lidar data), etc. We can use that data for the research and analysis purpose.

I have downloaded the data from the NOAA. The data represents a small subset of the New York, USA.

The Las data is opened using the laspy.file.File object in the read mode. It can also be opened in readwrite mode. When a file is opened in read mode, laspy first reads the header, and then maps the point records with numpy. The code for reading is shown below:

Reading and plotting Lidar data

inFile = laspy.file.File("./Job537118_ny2014_usgs_cmgp_sandy_longisland_m4938_sample.las", mode="r")
import matplotlib.pyplot as plt
%matplotlib inline
fig = plt.figure(figsize=[20, 5])
ax = plt.axes(projection='3d')
sc = ax.scatter(inFile.x, inFile.y, inFile.z, c=inFile.z ,s=0.1, marker='o', cmap="Spectral")

The Header and Point format in the las file can be accessed using the laspy object as shown below. The details of each field is explained in the document by ASPRS, the journal. 

#Header format
headerformat = inFile.header.header_format
for spec in headerformat:
    print(, end = ', '),
file_sig, file_source_id, global_encoding, proj_id_1, proj_id_2, proj_id_3, proj_id_4, version_major, version_minor, system_id, software_id, created_day, created_year, header_size, data_offset, num_variable_len_recs, data_format_id, data_record_length, point_records_count, point_return_count, x_scale, y_scale, z_scale, x_offset, y_offset, z_offset, x_max, x_min, y_max, y_min, z_max, z_min, 

# Point format 
pointformat = inFile.point_format
for spec in inFile.point_format:


Accessing Header information: Points information are accessible as properties of the main file( inFile) object, and header attributes are accessible via the header property of the main file (inFile.header) object. Click for a reference of field names to access the information.

Fetching Header and Point information

# Header Information
Min = inFile.header.min 
print(Min)  # info about min.x,min.y,min.z

[1119491.79, 181916.78, -16.37]

Max = inFile.header.max
print(Max) # info about max.x,max.y,max.z

[1123228.21, 184204.8, 101.64]

#Point information inFile.points # info of all data about the points with the attributes

array([((2220022, 8419921, 1913, 34, 9, 17, -17, 0, 32, 80836793.24317789),), ((2220022, 8419910, 1890, 29, 9, 18, -17, 0, 32, 80836793.24318624),), ((2220020, 8419918, 1913, 36, 9, 17, -17, 0, 32, 80836793.24321151),),
((2070960, 8192369, 2622, 20, 9, 17, 14, 0, 34, 80857781.42800069),), ((2070960, 8192355, 2632, 14, 9, 17, 14, 0, 34, 80857781.42800903),), ((2070963, 8192336, 2592, 13, 9, 17, 14, 0, 34, 80857781.42802572),)],
dtype=[(‘point’, [(‘X’, ‘<i4’), (‘Y’, ‘<i4’), (‘Z’, ‘<i4’), (‘intensity’, ‘<u2’), (‘flag_byte’, ‘u1’), (‘raw_classification’, ‘u1’), (‘scan_angle_rank’, ‘i1’), (‘user_data’, ‘u1’), (‘pt_src_id’, ‘<u2’), (‘gps_time’, ‘<f8’)])])

X = inFile.x  # only X data

[1122200.22 1122200.22 1122200.2 … 1120709.6 1120709.6 1120709.63]

Y = inFile.y # only Y data print(Y)

[184199.21 184199.1 184199.18 … 181923.69 181923.55 181923.36]

Z = inFile.z # only Y data

[19.13 18.9 19.13 … 26.22 26.32 25.92]

Finding points in neighborhood

With the help of these header and point attributes we can get stack the input point data and perform operations on it to extract valueble information. For example as shown below the X, Y, Z coordinates of the points are stacked into numpy array format and it is convenient to use with any python libraries to extract information. Like neighborhood of each point can be found by dumping the data into the packages like kdtree, FLANN etc.
coords = np.vstack((inFile.x, inFile.y,inFile.z)).transpose()

array([1.12220022e+06, 1.84199210e+05, 1.91300000e+01]



from pyflann import *
flann = FLANN()
result,dists = flann.nn(coords, coords[100,], num_neighbors = 5)
#neighbors = pf.nn(dataset, dataset[100,], num_neighbors = 5)
print("Five nearest neighbors of point 100: ")
print("Distances: ")

Five nearest neighbors of point 100:
[[100 114 113 112 111]]
[[0. 0.0251 0.0565 0.0734 0.1098]]

Using KDTree

from scipy import spatial

tree = spatial.KDTree(coords)
tree.query(coords[100,], k = 5)

(array([0. , 0.1584298 , 0.23769729, 0.27092434, 0.31968735]),
array([100, 114, 113, 112, 102]))

Clipping Lidar data

And also we can clip the data based on the point properties such as based on classification or based on return value, based on the height (Z) etc.

Based on the return

num_returns = inFile.num_returns
return_num = inFile.return_num
ground_points = inFile.points[num_returns == return_num]

print("%i points out of %i were last return points." % (len(ground_points),

2300354 points out of 2636371 were last return points.

Based on classification

We can also store these subset or modified point data in the LAS format using the write mode. The sample code is shown below:

from laspy.file import File
I = inFile.Classification == 2
outFile = File('groundclass_subset_output.las', mode='w', header=inFile.header)
outFile.points = inFile.points[I]
# reading the ground class las file
inFile2 = laspy.file.File("./groundclass_subset_output.las", mode="r")
# Displaying the ground classlasfile
fig = plt.figure(figsize=[20, 5])
ax = plt.axes(projection='3d')
sc=ax.scatter(inFile2.x, inFile2.y, inFile2.z,c=inFile2.z,s=0.1,marker='o',cmap="Spectral")
plt.title('Ground data')

Based on height

from laspy.file import File
I = inFile.z > 0
outFile = File('las_data_height_output.las', mode='w', header=inFile.header)
outFile.points = inFile.points[I]

[ True True True … True True True]

# reading the modified las based on height file
inFile3 = laspy.file.File("./las_data_height_output.las", mode="r")

#Displaying the las file
fig = plt.figure(figsize=[20, 5])
ax = plt.axes(projection='3d')
sc = ax.scatter(inFile3.x, inFile3.y, inFile3.z,c=inFile3.z,s=0.1,marker='o',cmap="Spectral")
plt.title('Point subsets having height greater than zero')


Changing existing Lidar data

We can also modify the point and header information if we open the file using read write mode. For this, first we make a copy of our input data.
outFile = File('duplicate_data.las', mode='w', header=inFile.header)
outFile.points = inFile.points

Here we make changes in classification. Sample code is shown below:

inFile_dup = laspy.file.File("duplicate_data.las", mode = "rw")
arr1 = inFile_dup.classification
inFile_dup.classification = np.zeros((arr1.shape[0]), dtype=int)

arr2 = inFile_dup.classification

[ 1 2 7 17 18]

For any queries related to this blog, you can reach me at LinkedIn
Stay tuned for interesting articles ahead!

Leave A Comment

All fields marked with an asterisk (*) are required