Metadata-Version: 2.1
Name: pdal
Version: 3.2.2
Summary: Point cloud data processing
Home-page: https://pdal.io
Author: Howard Butler
Author-email: howard@hobu.co
Maintainer: Howard Butler
Maintainer-email: howard@hobu.co
License: BSD
Keywords: point cloud spatial
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Topic :: Scientific/Engineering :: GIS
Description-Content-Type: text/x-rst
License-File: LICENSE.txt
Requires-Dist: numpy

================================================================================
PDAL
================================================================================

PDAL Python support allows you to process data with PDAL into `Numpy`_ arrays.
It provides a PDAL extension module to control Python interaction with PDAL.
Additionally, you can use it to fetch `schema`_ and `metadata`_ from PDAL
operations.

Installation
--------------------------------------------------------------------------------

PyPI
................................................................................

PDAL Python support is installable via PyPI:

.. code-block::

    pip install PDAL

GitHub
................................................................................

The repository for PDAL's Python extension is available at
https://github.com/PDAL/python

Python support has been released independently from PDAL itself since PDAL 1.7.

Usage
--------------------------------------------------------------------------------

Simple
................................................................................

Given the following pipeline, which simply reads an `ASPRS LAS`_ file and
sorts it by the ``X`` dimension:

.. _`ASPRS LAS`: https://www.asprs.org/committee-general/laser-las-file-format-exchange-activities.html

.. code-block:: python

    json = """
    {
      "pipeline": [
        "1.2-with-color.las",
        {
            "type": "filters.sort",
            "dimension": "X"
        }
      ]
    }"""

    import pdal
    pipeline = pdal.Pipeline(json)
    count = pipeline.execute()
    arrays = pipeline.arrays
    metadata = pipeline.metadata
    log = pipeline.log

Programmatic Pipeline Construction
................................................................................

The previous example specified the pipeline as a JSON string. Alternatively, a
pipeline can be constructed by creating ``Stage`` instances and piping them
together. For example, the previous pipeline can be specified as:

.. code-block:: python

    pipeline = pdal.Reader("1.2-with-color.las") | pdal.Filter.sort(dimension="X")

Stage Objects
=============

- A stage is an instance of ``pdal.Reader``, ``pdal.Filter`` or ``pdal.Writer``.
- A stage can be instantiated by passing as keyword arguments the options
  applicable to the respective PDAL stage. For more on PDAL stages and their
  options, check the PDAL documentation on Stage Objects.
- The ``filename`` option of ``Readers`` and ``Writers`` as well as the
  ``type`` option of ``Filters`` can be passed positionally as the first
  argument.
- The ``inputs`` option specifies a sequence of stages to be set as input to
  the current stage. Each input can be either the string tag of another stage,
  or the ``Stage`` instance itself.
- The ``Reader``, ``Filter`` and ``Writer`` classes come with static methods
  for all the respective PDAL drivers. For example, ``pdal.Filter.head()`` is
  a shortcut for ``pdal.Filter(type="filters.head")``. These methods are
  auto-generated by introspecting ``pdal``, and the available options are
  included in each method's docstring:

.. code-block::

    >>> help(pdal.Filter.head)
    Help on function head in module pdal.pipeline:

    head(**kwargs)
        Return N points from beginning of the point cloud.

        user_data: User JSON
        log: Debug output filename
        option_file: File from which to read additional options
        where: Expression describing points to be passed to this filter
        where_merge='auto': If 'where' option is set, describes how skipped points should be merged with kept points in standard mode.
        count='10': Number of points to return from beginning.  If 'invert' is true, number of points to drop from the beginning.
        invert='false': If true, 'count' specifies the number of points to skip from the beginning.
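Putting these rules together, stages can be constructed explicitly and
assembled into a pipeline. The following is a minimal sketch; the file names,
the decimation step, and the ``input`` tag are illustrative:

.. code-block:: python

    import pdal

    # The first positional argument is 'filename' for readers/writers
    # and 'type' for filters.
    reader = pdal.Reader("1.2-with-color.las", tag="input")
    decimate = pdal.Filter("filters.decimation", step=2.0, inputs=["input"])

    # 'inputs' can also be given Stage instances directly.
    writer = pdal.Writer.las("decimated.las", inputs=[decimate])

    pipeline = pdal.Pipeline([reader, decimate, writer])
    print(pipeline.execute())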
Pipeline Objects
================

A ``pdal.Pipeline`` instance can be created from:

- a JSON string: ``Pipeline(json_string)``
- a sequence of ``Stage`` instances: ``Pipeline([stage1, stage2])``
- a single ``Stage`` with the ``Stage.pipeline`` method: ``stage.pipeline()``
- nothing: ``Pipeline()`` creates a pipeline with no stages.
- joining ``Stage`` and/or other ``Pipeline`` instances together with the pipe
  operator (``|``):

  - ``stage1 | stage2``
  - ``stage1 | pipeline1``
  - ``pipeline1 | stage1``
  - ``pipeline1 | pipeline2``

Every application of the pipe operator creates a new ``Pipeline`` instance. To
update an existing ``Pipeline`` use the respective in-place pipe operator
(``|=``):

.. code-block:: python

    # update pipeline in-place
    pipeline = pdal.Pipeline()
    pipeline |= stage
    pipeline |= pipeline2

Reading using Numpy Arrays
................................................................................

The following more complex scenario demonstrates the full cycling between PDAL
and Python:

* Read a small testfile from GitHub into a Numpy array
* Filter the array with Numpy for Intensity
* Pass the filtered array to PDAL to be filtered again
* Write the final filtered array to a LAS file and a TileDB_ array via the
  `TileDB-PDAL integration`_ using the `TileDB writer plugin`_

.. code-block:: python

    import pdal

    data = "https://github.com/PDAL/PDAL/blob/master/test/data/las/1.2-with-color.las?raw=true"

    pipeline = pdal.Reader.las(filename=data).pipeline()
    print(pipeline.execute())  # 1065 points

    # Get the data from the first array
    # [array([(637012.24, 849028.31, 431.66, 143, 1,
    #       1, 1, 0, 1, -9., 132, 7326, 245380.78254963, 68, 77, 88),
    # dtype=[('X', '<f8'), ('Y', '<f8'), ('Z', '<f8'), ('Intensity', '<u2'),
    #        ('ReturnNumber', 'u1'), ('NumberOfReturns', 'u1'),
    #        ('ScanDirectionFlag', 'u1'), ('EdgeOfFlightLine', 'u1'),
    #        ('Classification', 'u1'), ('ScanAngleRank', '<f4'), ('UserData', 'u1'),
    #        ('PointSourceId', '<u2'), ('GpsTime', '<f8'), ('Red', '<u2'),
    #        ('Green', '<u2'), ('Blue', '<u2')])]
    arr = pipeline.arrays[0]

    # Keep only the points with Intensity > 30
    intensity = arr[arr["Intensity"] > 30]
    print(len(intensity))  # 704 points

    # Now use pdal to clamp points that have intensity 100 <= v < 300
    pipeline = pdal.Filter.range(limits="Intensity[100:300)").pipeline(intensity)
    print(pipeline.execute())  # 387 points
    clamped = pipeline.arrays[0]

    # Write our intensity data to a LAS file and a TileDB array. For TileDB it is
    # recommended to use Hilbert ordering by default with geospatial point cloud data,
    # which requires specifying a domain extent. This can be determined automatically
    # from a stats filter that computes statistics about each dimension (min, max, etc.).
    pipeline = pdal.Writer.las(
        filename="clamped.las",
        offset_x="auto",
        offset_y="auto",
        offset_z="auto",
        scale_x=0.01,
        scale_y=0.01,
        scale_z=0.01,
    ).pipeline(clamped)
    pipeline |= pdal.Filter.stats() | pdal.Writer.tiledb(array_name="clamped")
    print(pipeline.execute())  # 387 points

    # Dump the TileDB array schema
    import tiledb
    with tiledb.open("clamped") as a:
        print(a.schema)
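After execution, the per-dimension statistics gathered by the stats filter can
be read back through ``Pipeline.metadata``. The following is a sketch only:
the nesting of the metadata dictionary assumed here
(``metadata`` / ``filters.stats`` / ``statistic``) may differ across PDAL
versions, so verify it against ``print(pipeline.metadata)``:

.. code-block:: python

    # Assumed metadata layout; check print(pipeline.metadata) for your PDAL version
    stats = pipeline.metadata["metadata"]["filters.stats"]["statistic"]
    for dimension in stats:
        print(dimension["name"], dimension["minimum"], dimension["maximum"])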
Executing Streamable Pipelines
................................................................................

Streamable pipelines (pipelines that consist exclusively of streamable PDAL
stages) can be executed in streaming mode via ``Pipeline.iterator()``. This
returns an iterator object that yields Numpy arrays of up to ``chunk_size``
size (default=10000) at a time.

.. code-block:: python

    import pdal

    pipeline = pdal.Reader("test/data/autzen-utm.las") | pdal.Filter.range(limits="Intensity[80:120)")
    for array in pipeline.iterator(chunk_size=500):
        print(len(array))

    # or to concatenate all arrays into one
    # full_array = np.concatenate(list(pipeline))

``Pipeline.iterator()`` also takes an optional ``prefetch`` parameter
(default=0) to allow prefetching up to this number of arrays in parallel and
buffering them until they are yielded to the caller.

If you just want to execute a streamable pipeline in streaming mode and don't
need to access the data points (typically when the pipeline has Writer
stage(s)), you can use the ``Pipeline.execute_streaming(chunk_size)`` method
instead. This is functionally equivalent to
``sum(map(len, pipeline.iterator(chunk_size)))`` but more efficient as it
avoids allocating and filling any arrays in memory.
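For example, a streamable read-filter-write pipeline can be executed end to
end without materializing point arrays in Python. A minimal sketch, with
illustrative file names:

.. code-block:: python

    import pdal

    pipeline = (
        pdal.Reader("test/data/autzen-utm.las")
        | pdal.Filter.range(limits="Intensity[80:120)")
        | pdal.Writer.las(filename="filtered.las")
    )
    # Processes the data in chunks and returns the total point count
    print(pipeline.execute_streaming(chunk_size=500))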
Accessing Mesh Data
................................................................................

Some PDAL stages (for instance ``filters.delaunay``) create TIN type mesh
data. This data can be accessed in Python using the ``Pipeline.meshes``
property, which returns a ``numpy.ndarray`` of shape (1, n) where n is the
number of Triangles in the mesh. If the PointView contains no mesh data, then
n = 0.

Each Triangle is a tuple ``(A, B, C)`` where A, B and C are indices into the
PointView identifying the point that is the vertex for the Triangle.
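A short sketch of reading the triangles back after a Delaunay triangulation;
the input file name is illustrative and the field access assumes the
``(A, B, C)`` layout described above:

.. code-block:: python

    import pdal

    pipeline = pdal.Reader("test/data/autzen-utm.las") | pdal.Filter.delaunay()
    pipeline.execute()

    triangles = pipeline.meshes[0]  # mesh of the first PointView, shape (1, n)
    points = pipeline.arrays[0]

    # Each triangle holds three indices into the PointView's points
    tri = triangles[0][0]
    print(points[tri[0]], points[tri[1]], points[tri[2]])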
Meshio Integration
................................................................................

The ``meshes`` property provides the face data but is not easy to use as a
mesh. Therefore, we have provided optional integration into the Meshio
library.

The ``pdal.Pipeline`` class provides the ``get_meshio(idx: int) -> meshio.Mesh``
method. This method creates a ``Mesh`` object from the ``PointView`` array and
mesh properties.

.. note::

    The meshio integration requires that meshio is installed (e.g. ``pip
    install meshio``). If it is not, then the method fails with an informative
    RuntimeError.

Simple use of the functionality could be as follows:

.. code-block:: python

    import pdal

    ...

    pl = pdal.Pipeline(pipeline)
    pl.execute()

    mesh = pl.get_meshio(0)
    mesh.write('test.obj')

Advanced Mesh Use Case
................................................................................

USE-CASE: Take a LiDAR map, create a mesh from the ground points, split it
into tiles and store the tiles in PostGIS.

.. note::

    Like ``Pipeline.arrays``, ``Pipeline.meshes`` returns a list of
    ``numpy.ndarray`` to provide for the case where the output from a Pipeline
    is multiple PointViews. (The example below uses 1.2-with-color.las and
    skips the ground classification for clarity.)

.. code-block:: python

    import pdal
    import psycopg2
    import io

    pl = (
        pdal.Reader(".../python/test/data/1.2-with-color.las")
        | pdal.Filter.splitter(length=1000)
        | pdal.Filter.delaunay()
    )
    pl.execute()

    conn = psycopg2.connect(%CONNECTION_STRING%)
    buffer = io.StringIO()

    for idx in range(len(pl.meshes)):
        m = pl.get_meshio(idx)
        if m:
            m.write(buffer, file_format="wkt")
            with conn.cursor() as curr:
                curr.execute(
                    "INSERT INTO %table-name% (mesh) VALUES (ST_GeomFromEWKT(%(ewkt)s))",
                    {"ewkt": buffer.getvalue()},
                )
            # Reset the buffer before writing the next tile
            buffer.seek(0)
            buffer.truncate()

    conn.commit()
    conn.close()
    buffer.close()

.. _`Numpy`: http://www.numpy.org/
.. _`schema`: http://www.pdal.io/dimensions.html
.. _`metadata`: http://www.pdal.io/development/metadata.html
.. _`TileDB`: https://tiledb.com/
.. _`TileDB-PDAL integration`: https://docs.tiledb.com/geospatial/pdal
.. _`TileDB writer plugin`: https://pdal.io/stages/writers.tiledb.html

.. image:: https://github.com/PDAL/python/workflows/Build/badge.svg
   :target: https://github.com/PDAL/python/actions?query=workflow%3ABuild

Requirements
================================================================================

* PDAL 2.4+
* Python >=3.7
* Pybind11 (e.g. :code:`pip install pybind11[global]`)
* Numpy (e.g. :code:`pip install numpy`)
* scikit-build (e.g. :code:`pip install scikit-build`)

Changes
--------------------------------------------------------------------------------

3.2.2
................................................................................

* Implement move ctor to satisfy MSVC 2019 https://github.com/PDAL/python/commit/667f56bd0ee465f55a14636986e80b0a9cefcf14

3.2.1
................................................................................

* Implement #129, add pandas DataFrame i/o for convenience by @hobu in https://github.com/PDAL/python/pull/130
* Harden getMetadata and related calls from getting non-utf-8 'json' by @hobu in https://github.com/PDAL/python/pull/140
* Ignore DataFrame test if not GeoPandas, give up on Python 3.7 builds by @hobu in https://github.com/PDAL/python/pull/137

3.2.0
................................................................................

* PDAL base library 2.4.0+ is required
* CMake project name updated to pdal-python
* `srswkt2` property added to allow fetching of SRS info
* pip builds require cmake >= 3.11
* CMAKE_CXX_STANDARD set to c++17 to match PDAL 2.4.x
* Driver and options *actually* use the library instead of shelling out to the `pdal` application :)
* _get_json renamed to toJSON and made public
* Fix #119, 'json' optional kwarg put back for now
* DEVELOPMENT_COMPONENT in CMake FindPython skipped on OSX
* Make sure 'type' gets set when serializing to JSON

3.1.0
................................................................................

* **Breaking change**: ``pipeline.metadata`` now returns a dictionary from ``json.loads`` instead of a string.
* ``pipeline.quickinfo`` will fetch the PDAL ``preview()`` information for a data source. You can use this to fetch header or other information without reading data. https://github.com/PDAL/python/pull/109
* PDAL driver and option collection now uses the PDAL library directly rather than shelling out to the ``pdal`` command https://github.com/PDAL/python/pull/107
* Pipelines now support pickling for use with things like Dask https://github.com/PDAL/python/pull/110

3.0.0
................................................................................

* Pythonic pipeline creation https://github.com/PDAL/python/pull/91
* Support streaming pipeline execution https://github.com/PDAL/python/pull/94
* Replace Cython with PyBind11 https://github.com/PDAL/python/pull/102
* Remove pdal.pio module https://github.com/PDAL/python/pull/101
* Move readers.numpy and filters.python to separate repository https://github.com/PDAL/python/pull/104
* Miscellaneous refactorings and cleanups

2.3.5
................................................................................

* Fix memory leak https://github.com/PDAL/python/pull/74
* Handle metadata with invalid unicode by erroring https://github.com/PDAL/python/pull/74

2.3.0
................................................................................

* PDAL Python support 2.3.0 requires PDAL 2.1+. Older PDAL base libraries likely will not work.
* Python support built using scikit-build
* readers.numpy and filters.python are installed along with the extension.
* Pipeline can take in a list of arrays that are passed to readers.numpy
* readers.numpy now supports functions that return arrays. See https://pdal.io/stages/readers.numpy.html for more detail.

2.0.0
................................................................................

* PDAL Python extension is now in its own repository on its own release schedule at https://github.com/PDAL/python
* Extension now builds and works under PDAL OSGeo4W64 on Windows.