PySpark Plaso
Release 2019
A tool for distributed extraction of timestamps from various files using extractors adapted from the Plaso engine to Apache Spark.
Public Member Functions
def __init__(self, hdfs_base_uri)
def make_hdfs_uri(self, hdfs_path)
def strip_hdfs_uri(self, hdfs_path)
Public Attributes
hdfs_base_uri
Base class for controllers.
def plaso.tarzan.app.controllers.controller.Controller.__init__(self, hdfs_base_uri)
Create a new controller that stores and uses a base HDFS URI.
:param hdfs_base_uri: the base HDFS URI to store
Reimplemented in plaso.tarzan.app.controllers.filemancontroller.FileManController.
def plaso.tarzan.app.controllers.controller.Controller.make_hdfs_uri(self, hdfs_path)
Get a full HDFS URI by appending a given path to the base HDFS URI.
:param hdfs_path: the HDFS path
:return: the full HDFS URI
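The page does not show the method body, so the following is only a minimal sketch of how such a join might behave. The class name `SketchController`, the example URI `hdfs://namenode:8020`, the example path, and the slash normalization are all assumptions for illustration; the actual PySpark Plaso implementation may simply concatenate the two strings.

```python
class SketchController:
    """Hypothetical stand-in for Controller; not the PySpark Plaso source."""

    def __init__(self, hdfs_base_uri):
        # Store the base URI, e.g. "hdfs://namenode:8020" (example value).
        self.hdfs_base_uri = hdfs_base_uri

    def make_hdfs_uri(self, hdfs_path):
        # Assumption: join with exactly one "/" between base and path,
        # regardless of trailing or leading slashes on either side.
        return self.hdfs_base_uri.rstrip("/") + "/" + hdfs_path.lstrip("/")


controller = SketchController("hdfs://namenode:8020")
full_uri = controller.make_hdfs_uri("/evidence/disk.img")
print(full_uri)
```

Normalizing the slashes is a design choice of this sketch: it avoids producing `//` or a missing separator when callers are inconsistent about leading and trailing slashes.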
def plaso.tarzan.app.controllers.controller.Controller.strip_hdfs_uri(self, hdfs_path)
Get a given HDFS path without its HDFS URI prefix.
:param hdfs_path: the HDFS path
:return: the HDFS path without the HDFS URI prefix
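As an illustration of the stripping described here, the sketch below removes a base URI prefix from a path. It is written as a free function so that it is self-contained; the name `strip_hdfs_uri_sketch`, the example URI, and the choice to return the input unchanged when the prefix is absent are assumptions, not the documented behaviour of the real method.

```python
def strip_hdfs_uri_sketch(hdfs_base_uri, hdfs_path):
    """Hypothetical helper: drop the base HDFS URI prefix from a path."""
    base = hdfs_base_uri.rstrip("/")
    if hdfs_path == base:
        # The URI names the file system root itself.
        return "/"
    if hdfs_path.startswith(base + "/"):
        # Keep the leading "/" so the result is an absolute HDFS path.
        return hdfs_path[len(base):]
    # Assumption: paths without the prefix are returned unchanged.
    return hdfs_path


base_uri = "hdfs://namenode:8020"  # example value, not from the source
stripped = strip_hdfs_uri_sketch(base_uri, "hdfs://namenode:8020/evidence/disk.img")
print(stripped)
```

Matching on `base + "/"` rather than on `base` alone keeps the sketch from stripping an unrelated URI that merely shares the same leading characters.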
plaso.tarzan.app.controllers.controller.Controller.hdfs_base_uri