[Forensics-changes] [yara] 208/407: Update documentation

Hilko Bengen bengen at moszumanska.debian.org
Sat Jul 1 10:28:26 UTC 2017


This is an automated email from the git hooks/post-receive script.

bengen pushed a commit to annotated tag v3.3.0
in repository yara.

commit 72136d8effc35be925ac8154dcbcf955c4d7f5bb
Author: Victor M. Alvarez <plusvic at gmail.com>
Date:   Fri Nov 7 15:59:53 2014 +0100

    Update documentation
---
 docs/gettingstarted.rst |   9 +++-
 docs/modules.rst        |   9 ++--
 docs/modules/cuckoo.rst |   7 ++++
 docs/modules/elf.rst    |   2 +
 docs/modules/hash.rst   |  53 +++++++++++++++++++++++
 docs/modules/magic.rst  |   2 +
 docs/modules/pe.rst     | 109 +++++++++++++++++++++++++++++++++++++++++++++---
 docs/writingmodules.rst |  61 ++++++++++++++++++++-------
 docs/writingrules.rst   |  36 +++++++++++-----
 9 files changed, 251 insertions(+), 37 deletions(-)

diff --git a/docs/gettingstarted.rst b/docs/gettingstarted.rst
index ca1903c..0ddc74b 100644
--- a/docs/gettingstarted.rst
+++ b/docs/gettingstarted.rst
@@ -23,6 +23,14 @@ way::
     make
     sudo make install
 
+Some of YARA's features depend on the OpenSSL library. Those features are
+built into YARA only if the OpenSSL library is installed on your
+system. The ``configure`` script automatically detects whether OpenSSL is
+installed. If you want to make sure that YARA is built with
+OpenSSL-dependent features, you must pass ``--with-crypto`` to the
+``configure`` script.
+
+
 The following modules are not copiled into YARA by default:
 
 * cuckoo
@@ -37,7 +45,6 @@ For example::
     ./configure --enable-magic
     ./configure --enable-cuckoo --enable-magic
 
-
 Modules usually depends on external libraries, depending on the modules you
 choose to install you'll need the following libraries:
 
diff --git a/docs/modules.rst b/docs/modules.rst
index faf62a8..869c709 100644
--- a/docs/modules.rst
+++ b/docs/modules.rst
@@ -12,10 +12,11 @@ the :ref:`writing-modules` section.
 .. toctree::
    :maxdepth: 3
 
-   modules/pe
-   modules/elf
-   modules/cuckoo
-   modules/magic
+   PE <modules/pe>
+   ELF <modules/elf>
+   Cuckoo <modules/cuckoo>
+   Magic <modules/magic>
+   Hash <modules/hash>
 
 
 
diff --git a/docs/modules/cuckoo.rst b/docs/modules/cuckoo.rst
index ceee712..f6f929f 100644
--- a/docs/modules/cuckoo.rst
+++ b/docs/modules/cuckoo.rst
@@ -108,6 +108,13 @@ Reference
         Similar to :func:`http_request`, but only takes into account POST
         requests.
 
+    .. function:: dns_lookup(regexp)
+
+        Function returning true if the program sent a domain name resolution
+        request for a domain matching the provided regular expression.
+
+        *Example: cuckoo.network.dns_lookup(/evil\\.com/)*
+
 .. type:: registry
 
     .. function:: key_access(regexp)
diff --git a/docs/modules/elf.rst b/docs/modules/elf.rst
index 208ee81..d79be86 100644
--- a/docs/modules/elf.rst
+++ b/docs/modules/elf.rst
@@ -5,6 +5,8 @@
 ELF module
 ##########
 
+.. versionadded:: 3.2.0
+
 The ELF module is very similar to the :ref:`pe-module`, but for ELF files. This
 module exposes most of the fields present in a ELF header. Let's see some
 examples::
diff --git a/docs/modules/hash.rst b/docs/modules/hash.rst
new file mode 100644
index 0000000..2ec0861
--- /dev/null
+++ b/docs/modules/hash.rst
@@ -0,0 +1,53 @@
+
+.. _hash-module:
+
+###########
+Hash module
+###########
+
+.. versionadded:: 3.2.0
+
+The Hash module allows you to calculate hashes (MD5, SHA1, SHA256) from portions
+of your file and create signatures based on those hashes.
+
+.. important::
+    This module depends on the OpenSSL library. Please refer to
+    :ref:`compiling-yara` for information about how to build OpenSSL-dependent
+    features into YARA.
+
+    Good news for Windows users: this module is already included in the official
+    Windows binaries.
+
+.. c:function:: md5(offset, size)
+
+    Returns the MD5 hash for *size* bytes starting at *offset*. When scanning a
+    running process the *offset* argument should be a virtual address within
+    the process address space. The returned string is always in lowercase.
+
+    *Example: hash.md5(0, filesize) == "feba6c919e3797e7778e8f2e85fa033d"*
+
+.. c:function:: md5(string)
+
+    Returns the MD5 hash for the given string.
+
+    *Example: hash.md5("dummy") == "275876e34cf609db118f3d84b799a790"*
+
+.. c:function:: sha1(offset, size)
+
+    Returns the SHA1 hash for *size* bytes starting at *offset*. When scanning a
+    running process the *offset* argument should be a virtual address within
+    the process address space. The returned string is always in lowercase.
+
+.. c:function:: sha1(string)
+
+    Returns the SHA1 hash for the given string.
+
+.. c:function:: sha256(offset, size)
+
+    Returns the SHA256 hash for *size* bytes starting at *offset*. When scanning a
+    running process the *offset* argument should be a virtual address within
+    the process address space. The returned string is always in lowercase.
+
+.. c:function:: sha256(string)
+
+    Returns the SHA256 hash for the given string.
\ No newline at end of file
diff --git a/docs/modules/magic.rst b/docs/modules/magic.rst
index 5f4b152..ccec20a 100644
--- a/docs/modules/magic.rst
+++ b/docs/modules/magic.rst
@@ -5,6 +5,8 @@
 Magic module
 ############
 
+.. versionadded:: 3.1.0
+
 The Magic module allows you to identify the type of the file based on the
 output of `file <http://en.wikipedia.org/wiki/File_(command)>`_, the standard
 Unix command.
diff --git a/docs/modules/pe.rst b/docs/modules/pe.rst
index 6d864a7..94a0ea4 100644
--- a/docs/modules/pe.rst
+++ b/docs/modules/pe.rst
@@ -159,23 +159,23 @@ Reference
 
         Section name.
 
-    .. c:type:: characteristics
+    .. c:member:: characteristics
 
         Section characteristics.
 
-    .. c:type:: virtual_address
+    .. c:member:: virtual_address
 
         Section virtual address.
 
-    .. c:type:: virtual_size
+    .. c:member:: virtual_size
 
         Section virtual size.
 
-    .. c:type:: raw_data_offset
+    .. c:member:: raw_data_offset
 
         Section raw offset.
 
-    .. c:type:: raw_data_size
+    .. c:member:: raw_data_size
 
         Section raw size.
 
@@ -202,6 +202,91 @@ Reference
 
     *Example:  pe.version_info["CompanyName"] contains "Microsoft"*
 
+    .. versionadded:: 3.2.0
+
+.. c:type:: number_of_signatures
+
+    Number of authenticode signatures in the PE.
+
+.. c:type:: signatures
+
+    A zero-based array of signature objects, one for each authenticode
+    signature in the PE file. Usually PE files have a single signature.
+
+    .. c:member:: issuer
+
+        A string containing information about the issuer. These are some
+        examples::
+
+            "/C=US/ST=Washington/L=Redmond/O=Microsoft Corporation/CN=Microsoft Code Signing PCA"
+
+            "/C=US/O=VeriSign, Inc./OU=VeriSign Trust Network/OU=Terms of use at https://www.verisign.com/rpa (c)10/CN=VeriSign Class 3 Code Signing 2010 CA"
+
+            "/C=GB/ST=Greater Manchester/L=Salford/O=COMODO CA Limited/CN=COMODO Code Signing CA 2"
+
+    .. c:member:: subject
+
+        A string containing information about the subject.
+
+    .. c:member:: version
+
+        Version number.
+
+    .. c:member:: algorithm
+
+        Algorithm used for this signature. Usually "sha1WithRSAEncryption".
+
+    .. c:member:: serial
+
+        A string containing the serial number. This is an example::
+
+            "52:00:e5:aa:25:56:fc:1a:86:ed:96:c9:d4:4b:33:c7"
+
+    .. c:member:: not_before
+
+        Unix timestamp on which validity period for this signature begins.
+
+    .. c:member:: not_after
+
+        Unix timestamp on which validity period for this signature ends.
+
+    .. c:member:: valid_on(timestamp)
+
+        Function returning true if the signature was valid on the date
+        indicated by *timestamp*. The following expression::
+
+            pe.signatures[n].valid_on(timestamp)
+
+        is equivalent to::
+
+            timestamp >= pe.signatures[n].not_before and timestamp <= pe.signatures[n].not_after
+
+.. c:type:: rich_signature
+
+    Structure containing information about PE's rich signature as documented
+    `here <http://www.ntcore.com/files/richsign.htm>`_.
+
+    .. c:member:: offset
+
+        Offset where the rich signature starts. It will be undefined if the
+        file doesn't have a rich signature.
+
+    .. c:member:: length
+
+        Length of the rich signature, not including the final "Rich" marker.
+
+    .. c:member:: key
+
+        Key used to encrypt the data with XOR.
+
+    .. c:member:: raw_data
+
+        Raw data as it appears in the file.
+
+    .. c:member:: clear_data
+
+        Data after being decrypted by XORing it with the key.
+
 .. c:function:: exports(function_name)
 
     Function returning true if the PE exports *function_name* or
@@ -225,6 +310,8 @@ Reference
 
     *Example: pe.locale(0x0419) // Russian (RU)*
 
+    .. versionadded:: 3.2.0
+
 .. c:function:: language(language_identifier)
 
     Function returning true if the PE has a resource with the specified language
@@ -234,3 +321,15 @@ Reference
 
     *Example: pe.language(0x0A) // Spanish*
 
+    .. versionadded:: 3.2.0
+
+.. c:function:: imphash()
+
+    Function returning the import hash or imphash for the PE. The imphash is
+    an MD5 hash of the PE's import table after some normalization. The imphash
+    for a PE can also be computed with `pefile <http://code.google.com/p/pefile/>`_, and you can find more information in
+    `Mandiant's blog <https://www.mandiant.com/blog/tracking-malware-import-hashing/>`_.
+
+    *Example: pe.imphash() == "b8bb385806b89680e13fc0cf24f4431e"*
+
+    .. versionadded:: 3.2.0
diff --git a/docs/writingmodules.rst b/docs/writingmodules.rst
index 85b139b..14f2d8b 100644
--- a/docs/writingmodules.rst
+++ b/docs/writingmodules.rst
@@ -365,6 +365,8 @@ when you start initializing its values.
 Dictionaries
 ------------
 
+.. versionadded:: 3.2.0
+
 You can also declare dictionaries of integers, strings, or structures::
 
     begin_declarations;
@@ -445,6 +447,20 @@ the declaration section, like this::
       ...your code here
     }
 
+Functions can be overloaded as in C++ and other programming languages. You can
+declare two functions with the same name as long as they differ in the type or
+number of arguments. One example of overloaded functions can be found in the
+:ref:`hash-module`, which has two functions for calculating MD5 hashes: one
+receiving an offset and length within the file and another one receiving a
+string::
+
+    begin_declarations;
+
+        declare_function("md5", "ii", "s", data_md5);
+        declare_function("md5", "s", "s", string_md5);
+
+    end_declarations;
+
 We are going to discuss function implementation more in depth in the
 :ref:`implementing-functions` section.
 
@@ -476,8 +492,6 @@ structure it may need, but most of the times they are just empty functions:
 
 Any returned value different from ``ERROR_SUCCESS`` will abort YARA's execution.
 
-
-
 Implementing the module's logic
 ===============================
 
@@ -634,9 +648,9 @@ declared in the declarations section, once you've parsed or analized the scanned
 data and/or any additional module's data. This is done by using the
 ``set_integer`` and ``set_string`` functions:
 
-.. c:function:: void set_integer(int64_t value, YR_OBJECT* object, char* field, ...)
+.. c:function:: void set_integer(int64_t value, YR_OBJECT* object, const char* field, ...)
 
-.. c:function:: void set_string(char* value, YR_OBJECT* object, char* field, ...)
+.. c:function:: void set_string(const char* value, YR_OBJECT* object, const char* field, ...)
 
 Both functions receive a value to be assigned to the variable, a pointer to a
 ``YR_OBJECT`` representing the variable itself or some ancestor of
@@ -751,13 +765,13 @@ implementation of your functions to retrieve values previously stored by
 ``module_load``.
 
 
-.. c:function:: int64_t get_integer(YR_OBJECT* object, char* field, ...)
+.. c:function:: int64_t get_integer(YR_OBJECT* object, const char* field, ...)
 
-.. c:function:: char* get_string(YR_OBJECT* object, char* field, ...)
+.. c:function:: char* get_string(YR_OBJECT* object, const char* field, ...)
 
 There's also a function to the get any ``YR_OBJECT`` in the objects tree:
 
-.. c:function:: YR_OBJECT* get_object(YR_OBJECT* object, char* field, ...)
+.. c:function:: YR_OBJECT* get_object(YR_OBJECT* object, const char* field, ...)
 
 Here goes a little exam...
 
@@ -839,22 +853,37 @@ Function arguments
 ------------------
 
 Within the function's code you get its arguments by using
-``integer_argument(n)``, ``string_argument(n)`` or ``regexp_argument(n)``
-depending on the type of the argument, where *n* is the 1-based argument's
-number.
+``integer_argument(n)``, ``regexp_argument(n)``, ``string_argument(n)`` or
+``sized_string_argument(n)`` depending on the type of the argument, where
+*n* is the 1-based argument's number.
 
-If your function receives a string, a regular expression and an integer in that
-order, you can get their values with:
+``string_argument(n)`` can be used when your function expects to receive a
+NULL-terminated C string. If your function can receive arbitrary binary data,
+possibly containing NULL characters, you must use ``sized_string_argument(n)``.
+
+Here are some examples:
 
 .. code-block:: c
 
-    char* arg_1 = string_argument(1);
+    int64_t arg_1 = integer_argument(1);
     RE_CODE arg_2 = regexp_argument(2);
-    int64_t arg_3 = integer_argument(3);
+    char* arg_3 = string_argument(3);
+    SIZED_STRING* arg_4 = sized_string_argument(4);
+
+The C type for integer arguments is ``int64_t``, for regular expressions is
+``RE_CODE``, for NULL-terminated strings is ``char*`` and for strings possibly
+containing NULL characters is ``SIZED_STRING*``.
+``SIZED_STRING`` structures have the following attributes:
+
+.. c:type:: SIZED_STRING
+
+    .. c:member:: length
+
+        String's length.
 
+    .. c:member:: c_string
 
-Notice that the C type for integer arguments is ``int64_t`` and for regular
-expressions is ``RE_CODE``.
+        ``char*`` pointing to the string content.
 
 Return values
 -------------
diff --git a/docs/writingrules.rst b/docs/writingrules.rst
index 8ac8ca2..ffca112 100644
--- a/docs/writingrules.rst
+++ b/docs/writingrules.rst
@@ -42,22 +42,29 @@ reserved and cannot be used as an identifier:
      - int8
      - int16
      - int32
+     - int8be
+     - int16be
+   * - int32be
      - matches
      - meta
-   * - nocase
+     - nocase
      - not
      - or
      - of
-     - private
+   * - private
      - rule
      - strings
-   * - them
+     - them
      - true
      - uint8
      - uint16
-     - uint32
+   * - uint32
+     - uint8be
+     - uint16be
+     - uint32be
      - wide
      -
+     -
 
 Rules are generally composed of two sections: strings definition and condition.
 The strings definition section can be omitted if the rule doesn't rely on any
@@ -509,7 +516,7 @@ In the majority of cases, when a string identifier is used in a condition, we
 are willing to know if the associated string is anywhere within the file or
 process memory, but sometimes we need to know if the string is at some specific
 offset on the file or at some virtual address within the process address space.
-In such situations the operator ``at` is what we need. This operator is used as
+In such situations the operator ``at`` is what we need. This operator is used as
 shown in the following example::
 
     rule AtExample
@@ -623,22 +630,29 @@ Accessing data at a given position
 There are many situations in which you may want to write conditions that
 depends on data stored at a certain file offset or memory virtual address,
 depending if we are scanning a file or a running process. In those situations
-you can use one of the following functions to read from the file at the given
-offset::
+you can use one of the following functions to read data from the file at the given offset::
 
     int8(<offset or virtual address>)
     int16(<offset or virtual address>)
     int32(<offset or virtual address>)
+
     uint8(<offset or virtual address>)
     uint16(<offset or virtual address>)
     uint32(<offset or virtual address>)
 
+    int8be(<offset or virtual address>)
+    int16be(<offset or virtual address>)
+    int32be(<offset or virtual address>)
+
+    uint8be(<offset or virtual address>)
+    uint16be(<offset or virtual address>)
+    uint32be(<offset or virtual address>)
+
 The ``intXX`` functions read 8, 16, and 32 bits signed integers from
 <offset or virtual address>, while functions ``uintXX`` read unsigned integers.
-Both 16 and 32 bits integer are considered to be little-endian.
-The <offset or virtual address> parameter can be any expression returning
-an unsigned integer, including the return value of one the ``uintXX`` functions
-itself. As an example let's see a rule to distinguish PE files::
+Both 16- and 32-bit integers are considered to be little-endian. If you
+want to read a big-endian integer, use the corresponding function ending
+in ``be``. The <offset or virtual address> parameter can be any expression returning an unsigned integer, including the return value of one of the ``uintXX`` functions itself. As an example let's see a rule to distinguish PE files::
 
     rule IsPE
     {

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/forensics/yara.git


