[Forensics-changes] [yara] 06/192: Add support for CLI parsing. (#356)

Hilko Bengen bengen at moszumanska.debian.org
Sat Jul 1 10:31:40 UTC 2017


This is an automated email from the git hooks/post-receive script.

bengen pushed a commit to annotated tag v3.6.0
in repository yara.

commit 7f1596d5b802567a0b81d95ddd0f6b3ebb234d6a
Author: Wesley Shields <wxs at atarininja.org>
Date:   Wed Aug 3 10:58:08 2016 -0400

    Add support for CLI parsing. (#356)
    
    * Add support for CLI parsing.
    
    This commit adds support for parsing the COM Runtime descriptor
    directory entry, which is only the start of the rabbit hole. ;)
    
    Ultimately this leads us to parsing the CLI MetaData streams, including
    GUIDs and a convoluted table structure containing various information
    relevant to the CLI.
    
    In particular, here are the things we parse out:
    
    +  declare_string("dotnet_version");
    +  declare_string("dotnet_module_name");
    +  begin_struct_array("dotnet_streams");
    +    declare_string("name");
    +    declare_integer("offset");
    +    declare_integer("size");
    +  end_struct_array("dotnet_streams");
    +  declare_integer("number_of_dotnet_streams");
    +  declare_string_array("dotnet_guids");
    +  begin_struct_array("dotnet_resources");
    +    declare_integer("offset");
    +    declare_integer("length");
    +    declare_string("name");
    +  end_struct_array("dotnet_resources");
    +  declare_integer("number_of_dotnet_resources");
    +  begin_struct("dotnet_assembly");
    +    begin_struct("version");
    +      declare_integer("major");
    +      declare_integer("minor");
    +      declare_integer("build_number");
    +      declare_integer("revision_number");
    +    end_struct("version");
    +    declare_string("name");
    +    declare_string("culture");
    +  end_struct("dotnet_assembly");
    +  declare_string_array("dotnet_modulerefs");
    +  declare_integer("number_of_dotnet_modulerefs");
    
    Unfortunately there is a lot of manual table walking because the tables
    contain variable-size columns, whose widths depend upon the number of rows
    in other tables. This makes the table walking code messier than I would
    like.
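    
    As a rough illustration of why the column widths vary (a sketch, not the
    code from this commit): ECMA-335 Section II.24.2.6 stores a simple table
    index in 2 bytes when the target table has fewer than 2^16 rows, and a
    coded index in 2 bytes only when the largest candidate table still fits
    beside the tag bits. The helper names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Width in bytes of a simple index into one table: 2 unless the table
   has more rows than a 16-bit value can address. */
static uint8_t simple_index_size(uint32_t rows)
{
  return rows > 0xFFFF ? 4 : 2;
}

/* Width in bytes of a coded index. The low tag_bits bits select the target
   table, so only (16 - tag_bits) bits remain for the row number.
   max_rows is the largest row count among the candidate tables. */
static uint8_t coded_index_size(uint32_t max_rows, unsigned tag_bits)
{
  assert(tag_bits < 16);
  return max_rows > (0xFFFFu >> tag_bits) ? 4 : 2;
}
```

    Under these assumptions a TypeRef row would occupy
    coded_index_size(max, 2) plus two string-index widths, which is the same
    arithmetic the BIT_TYPEREF case performs by hand.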
    
    And since I'm here and noticed it was unused, remove the unused
    RICH_DATA structure.
    
    * Add missing array counter for guids.
    
    * Improve handling obfuscated binaries.
    
    Add support for ENCLOG and ENCMAP entries, which I finally found in the
    wild (b7b5312393d08576f66f632d447a83a12a842d5a1311efe53aa807990ecaf91a).
    
    Make the bounds checking in pe_get_dotnet_string() more robust by
    properly calculating the string offset.
    
    Remove string_size variable. It was never used.
    
    Catch cases where there are multiple important streams (#Strings and #~),
    which appears to be an obfuscation technique. We always assume the
    first one is the correct one since that is what the official runtime
    seems to do.
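    
    The "first one wins" rule can be sketched as a plain linear search that
    stops at the first match. This is only an illustration: the function name
    and the flat array of names are hypothetical, and the module itself walks
    the packed stream headers rather than an array of strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the index of the first stream called `wanted`, or -1 if there is
   none. Later duplicates are never reached, so they are ignored. */
static int first_stream_index(const char* const* names, int count,
                              const char* wanted)
{
  assert(names != NULL && wanted != NULL);

  for (int i = 0; i < count; i++)
  {
    if (strcmp(names[i], wanted) == 0)
      return i;
  }

  return -1;
}
```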
    
    When setting array counters be sure to set an accurate value.
    
    * Create dotnet module.
    
    Split the dotnet functionality off from the pe module and put it in a
    separate module called dotnet.
    
    The pieces which are specific only to the pe module now live in
    yara/pe.h. These include IMPORTED_DLL and IMPORTED_FUNCTION.
    
    The pieces which are specific only to the dotnet module now live in
    yara/dotnet.h. These include CLI_HEADER, NET_METADATA, etc.
    
    The common pieces now live in pe_utils.c and the declarations and
    necessary macros live in yara/pe_utils.h
    
    The dotnet module uses the PE structure defined in pe.h, even though not
    all of it is necessary.
    
    Add include guard around yara/pe.h.
    
    * Refactor dotnet module.
    
    Break this module into a number of smaller functions. In particular
    there is now:
    
    dotnet_parse_com(): Parse the CLI header.
    
    dotnet_parse_stream_headers(): Parse the stream headers, a variable-length
    array that is part of the CLI metadata. This function also stores the
    various offsets into relevant streams like the GUID, tilde and strings
    streams.
    
    dotnet_parse_guid(): Parse the GUID stream.
    
    dotnet_parse_tilde(): Take the first pass through the #~ stream,
    collecting relevant information like number of rows per table that can
    be used as part of a coded index. Call dotnet_parse_tilde_2() at the
    end.
    
    dotnet_parse_tilde_2(): Take the second pass through the #~ stream. This
    is the function which is responsible for walking the individual tables
    and pulling out relevant information like assembly, resources, etc.
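    
    Both passes depend on the Valid bitmask: per ECMA-335 Section II.24.2.6
    the #~ header is followed by one 32-bit row count for each bit set in
    Valid, in bit order. A minimal sketch of that bookkeeping, using
    Kernighan's bit-counting trick, might look like the following. The
    function names are hypothetical and do not appear in the module.

```c
#include <assert.h>
#include <stdint.h>

/* Count the tables present, i.e. the bits set in Valid. Each iteration
   clears the lowest set bit, so the loop runs once per present table. */
static unsigned count_present_tables(uint64_t valid)
{
  unsigned n = 0;

  while (valid != 0)
  {
    valid &= valid - 1;
    n++;
  }

  return n;
}

/* Position of a table's row count in the array that follows the header,
   or -1 if the table is absent. It equals the number of present tables
   with a lower bit number. */
static int row_count_slot(uint64_t valid, unsigned table_bit)
{
  assert(table_bit < 64);

  if (((valid >> table_bit) & 1) == 0)
    return -1;

  return (int) count_present_tables(
      valid & ((UINT64_C(1) << table_bit) - 1));
}
```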
    
    * Fix constant to be consistent.
    
    * Add ROW_CHECK and ROW_CHECK_WITH_INDEX macros.
    
    * Chase changes in 5d6d8b1e.
    
    The changes were made in pe.c but I've moved this definition to pe_utils.h
    in my branch.
    
    * Use memmem(3) instead of strnstr(3).
    
    strnstr(3) is not portable, so rather than include it just for this one
    case, fake it using memmem(3). If necessary we can add our own
    implementation of memmem(3) and do all the necessary autoconf stuff to
    detect it.
    
    While here, include stdarg.h because it is necessary to suppress a warning
    with gcc about using va_start(3) and friends.
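    
    Since the needle in this one case is a single NUL byte, standard
    memchr(3) can perform the same bounded search without depending on
    memmem(3). The sketch below is an alternative under that assumption, not
    what the commit does, and bounded_string() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return a pointer to the string starting at `offset` inside `buf`, or
   NULL if the offset is out of range or no NUL terminator is found before
   the end of the buffer. */
static const char* bounded_string(const uint8_t* buf, size_t buf_size,
                                  size_t offset)
{
  assert(buf != NULL);

  /* The start of the string must lie inside the buffer. */
  if (offset >= buf_size)
    return NULL;

  /* Only scan the bytes that remain after the offset. */
  if (memchr(buf + offset, '\0', buf_size - offset) == NULL)
    return NULL;

  return (const char*) (buf + offset);
}
```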
    
    * Implement typelib support in dotnet.
    
    This isn't pretty by any means and can be cleaned up a bit but this is just the
    first attempt at getting them out.
    
    * Merge the RVA calculation fixes.
    
    Because pe_rva_to_offset got moved into pe_utils.c in the dotnet branch all the
    work that has been done to fix RVA to offset calculations was not being merged
    in here.
    
    * Update to new fetch_data() stuff.
---
 libyara/Makefile.am             |    4 +
 libyara/include/yara/dotnet.h   |  277 +++++++++
 libyara/include/yara/pe.h       |   61 +-
 libyara/include/yara/pe_utils.h |   30 +
 libyara/modules/dotnet.c        | 1268 +++++++++++++++++++++++++++++++++++++++
 libyara/modules/module_list     |    1 +
 libyara/modules/pe.c            |  239 +-------
 libyara/modules/pe_utils.c      |  193 +++++-
 8 files changed, 1827 insertions(+), 246 deletions(-)

diff --git a/libyara/Makefile.am b/libyara/Makefile.am
index be389a2..43e8a75 100644
--- a/libyara/Makefile.am
+++ b/libyara/Makefile.am
@@ -3,6 +3,7 @@ MODULES =  modules/tests.c
 MODULES += modules/pe.c
 MODULES += modules/elf.c
 MODULES += modules/math.c
+MODULES += modules/dotnet.c
 
 if CUCKOO_MODULE
 MODULES += modules/cuckoo.c
@@ -16,6 +17,9 @@ if HASH_MODULE
 MODULES += modules/hash.c
 endif
 
+# This isn't really a module, but needs to be compiled with them.
+MODULES += modules/pe_utils.c
+
 #
 # Add your modules here:
 #
diff --git a/libyara/include/yara/dotnet.h b/libyara/include/yara/dotnet.h
new file mode 100644
index 0000000..75aac08
--- /dev/null
+++ b/libyara/include/yara/dotnet.h
@@ -0,0 +1,277 @@
+#ifndef YR_DOTNET_H
+#define YR_DOTNET_H
+
+
+//
+// CLI header.
+// ECMA-335 Section II.25.3.3
+//
+typedef struct _CLI_HEADER {
+    DWORD Size; // Called "Cb" in documentation.
+    WORD MajorRuntimeVersion;
+    WORD MinorRuntimeVersion;
+    IMAGE_DATA_DIRECTORY MetaData;
+    DWORD Flags;
+    DWORD EntryPointToken;
+    IMAGE_DATA_DIRECTORY Resources;
+    IMAGE_DATA_DIRECTORY StrongNameSignature;
+    ULONGLONG CodeManagerTable;
+    IMAGE_DATA_DIRECTORY VTableFixups;
+    ULONGLONG ExportAddressTableJumps;
+    ULONGLONG ManagedNativeHeader;
+} CLI_HEADER, *PCLI_HEADER;
+
+#define NET_METADATA_MAGIC 0x424a5342
+
+//
+// CLI MetaData
+// ECMA-335 Section II.24.2.1
+//
+// Note: This is only part of the struct, as the rest of it is variable length.
+//
+typedef struct _NET_METADATA {
+    DWORD Magic;
+    WORD MajorVersion;
+    WORD MinorVersion;
+    DWORD Reserved;
+    DWORD Length;
+    char Version[0];
+} NET_METADATA, *PNET_METADATA;
+
+#define DOTNET_STREAM_NAME_SIZE 32
+
+//
+// CLI Stream Header
+// ECMA-335 Section II.24.2.2
+//
+typedef struct _STREAM_HEADER {
+    DWORD Offset;
+    DWORD Size;
+    char Name[0];
+} STREAM_HEADER, *PSTREAM_HEADER;
+
+
+//
+// CLI #~ Stream Header
+// ECMA-335 Section II.24.2.6
+//
+typedef struct _TILDE_HEADER {
+    DWORD Reserved1;
+    BYTE MajorVersion;
+    BYTE MinorVersion;
+    BYTE HeapSizes;
+    BYTE Reserved2;
+    ULONGLONG Valid;
+    ULONGLONG Sorted;
+} TILDE_HEADER, *PTILDE_HEADER;
+
+// These are the bit positions in Valid which will be set if the table
+// exists.
+#define BIT_MODULE                   0x00
+#define BIT_TYPEREF                  0x01
+#define BIT_TYPEDEF                  0x02
+#define BIT_FIELDPTR                 0x03 // Not documented in ECMA-335
+#define BIT_FIELD                    0x04
+#define BIT_METHODDEFPTR             0x05 // Not documented in ECMA-335
+#define BIT_METHODDEF                0x06
+#define BIT_PARAMPTR                 0x07 // Not documented in ECMA-335
+#define BIT_PARAM                    0x08
+#define BIT_INTERFACEIMPL            0x09
+#define BIT_MEMBERREF                0x0A
+#define BIT_CONSTANT                 0x0B
+#define BIT_CUSTOMATTRIBUTE          0x0C
+#define BIT_FIELDMARSHAL             0x0D
+#define BIT_DECLSECURITY             0x0E
+#define BIT_CLASSLAYOUT              0x0F
+#define BIT_FIELDLAYOUT              0x10
+#define BIT_STANDALONESIG            0x11
+#define BIT_EVENTMAP                 0x12
+#define BIT_EVENTPTR                 0x13 // Not documented in ECMA-335
+#define BIT_EVENT                    0x14
+#define BIT_PROPERTYMAP              0x15
+#define BIT_PROPERTYPTR              0x16 // Not documented in ECMA-335
+#define BIT_PROPERTY                 0x17
+#define BIT_METHODSEMANTICS          0x18
+#define BIT_METHODIMPL               0x19
+#define BIT_MODULEREF                0x1A
+#define BIT_TYPESPEC                 0x1B
+#define BIT_IMPLMAP                  0x1C
+#define BIT_FIELDRVA                 0x1D
+#define BIT_ENCLOG                   0x1E // Not documented in ECMA-335
+#define BIT_ENCMAP                   0x1F // Not documented in ECMA-335
+#define BIT_ASSEMBLY                 0x20
+#define BIT_ASSEMBLYPROCESSOR        0x21
+#define BIT_ASSEMBLYOS               0x22
+#define BIT_ASSEMBLYREF              0x23
+#define BIT_ASSEMBLYREFPROCESSOR     0x24
+#define BIT_ASSEMBLYREFOS            0x25
+#define BIT_FILE                     0x26
+#define BIT_EXPORTEDTYPE             0x27
+#define BIT_MANIFESTRESOURCE         0x28
+#define BIT_NESTEDCLASS              0x29
+#define BIT_GENERICPARAM             0x2A
+#define BIT_METHODSPEC               0x2B
+#define BIT_GENERICPARAMCONSTRAINT   0x2C
+// These are not documented in ECMA-335 nor is it clear what the format is.
+// They are for debugging information as far as I can tell.
+//#define BIT_DOCUMENT               0x30
+//#define BIT_METHODDEBUGINFORMATION 0x31
+//#define BIT_LOCALSCOPE             0x32
+//#define BIT_LOCALVARIABLE          0x33
+//#define BIT_LOCALCONSTANT          0x34
+//#define BIT_IMPORTSCOPE            0x35
+//#define BIT_STATEMACHINEMETHOD     0x36
+
+// The string length of a typelib attribute is at most 0xFF.
+#define MAX_TYPELIB_SIZE 0xFF
+
+//
+// Module table
+// ECMA-335 Section II.22.30
+//
+typedef struct _MODULE_TABLE {
+    WORD Generation;
+    union {
+        WORD Name_Short;
+        DWORD Name_Long;
+    } Name;
+    union {
+        WORD Mvid_Short;
+        DWORD Mvid_Long;
+    } Mvid;
+    union {
+        WORD EncId_Short;
+        DWORD EncId_Long;
+    } EncId;
+    union {
+        WORD EncBaseId_Short;
+        DWORD EncBaseId_Long;
+    } EncBaseId;
+} MODULE_TABLE, *PMODULE_TABLE;
+
+//
+// Assembly Table
+// ECMA-335 Section II.22.2
+//
+typedef struct _ASSEMBLY_TABLE {
+    DWORD HashAlgId;
+    WORD MajorVersion;
+    WORD MinorVersion;
+    WORD BuildNumber;
+    WORD RevisionNumber;
+    DWORD Flags;
+    union {
+        WORD PublicKey_Short;
+        DWORD PublicKey_Long;
+    } PublicKey;
+    union {
+        WORD Name_Short;
+        DWORD Name_Long;
+    } Name;
+} ASSEMBLY_TABLE, *PASSEMBLY_TABLE;
+
+//
+// Manifest Resource Table
+// ECMA-335 Section II.22.24
+//
+typedef struct _MANIFESTRESOURCE_TABLE {
+    DWORD Offset;
+    DWORD Flags;
+    union {
+        WORD Name_Short;
+        DWORD Name_Long;
+    } Name;
+    union {
+        WORD Implementation_Short;
+        DWORD Implementation_Long;
+    } Implementation;
+} MANIFESTRESOURCE_TABLE, *PMANIFESTRESOURCE_TABLE;
+
+//
+// ModuleRef Table
+// ECMA-335 Section II.22.31
+//
+// This is a short table, but necessary because the field size can change.
+//
+typedef struct _MODULEREF_TABLE {
+  union {
+      WORD Name_Short;
+      DWORD Name_Long;
+  } Name;
+} MODULEREF_TABLE, *PMODULEREF_TABLE;
+
+
+//
+// CustomAttribute Table
+// ECMA-335 Section II.22.10
+//
+typedef struct _CUSTOMATTRIBUTE_TABLE {
+  union {
+    WORD Parent_Short;
+    DWORD Parent_Long;
+  } Parent;
+  union {
+    WORD Type_Short;
+    DWORD Type_Long;
+  } Type;
+  union {
+    WORD Value_Short;
+    DWORD Value_Long;
+  } Value;
+} CUSTOMATTRIBUTE_TABLE, *PCUSTOMATTRIBUTE_TABLE;
+
+
+// Used to return offsets to the various headers.
+typedef struct _STREAMS {
+    PSTREAM_HEADER guid;
+    PSTREAM_HEADER tilde;
+    PSTREAM_HEADER string;
+    PSTREAM_HEADER blob;
+} STREAMS, *PSTREAMS;
+
+
+// Used to store the number of rows of each table.
+typedef struct _ROWS {
+    uint32_t module;
+    uint32_t moduleref;
+    uint32_t assemblyref;
+    uint32_t typeref;
+    uint32_t methoddef;
+    uint32_t memberref;
+    uint32_t typedef_;
+    uint32_t typespec;
+    uint32_t field;
+    uint32_t param;
+    uint32_t property;
+    uint32_t interfaceimpl;
+    uint32_t event;
+    uint32_t standalonesig;
+    uint32_t assembly;
+    uint32_t file;
+    uint32_t exportedtype;
+    uint32_t manifestresource;
+    uint32_t genericparam;
+    uint32_t genericparamconstraint;
+    uint32_t methodspec;
+    uint32_t assemblyrefprocessor;
+} ROWS, *PROWS;
+
+
+// Used to store the index sizes for the various tables.
+typedef struct _INDEX_SIZES {
+    uint8_t string;
+    uint8_t guid;
+    uint8_t blob;
+    uint8_t field;
+    uint8_t methoddef;
+    uint8_t memberref;
+    uint8_t param;
+    uint8_t event;
+    uint8_t typedef_;
+    uint8_t property;
+    uint8_t moduleref;
+    uint8_t assemblyrefprocessor;
+    uint8_t assemblyref;
+    uint8_t genericparam;
+} INDEX_SIZES, *PINDEX_SIZES;
+#endif
diff --git a/libyara/include/yara/pe.h b/libyara/include/yara/pe.h
index 9afa314..d03e91a 100644
--- a/libyara/include/yara/pe.h
+++ b/libyara/include/yara/pe.h
@@ -27,6 +27,11 @@ ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */
 
+#ifndef YR_PE_H
+#define YR_PE_H
+
+#include <yara/types.h>
+
 #pragma pack(push, 1)
 
 #if defined(_WIN32) || defined(__CYGWIN__)
@@ -285,6 +290,11 @@ typedef struct _IMAGE_OPTIONAL_HEADER64 {
 #define IMAGE_NT_OPTIONAL_HDR32_MAGIC      0x10b
 #define IMAGE_NT_OPTIONAL_HDR64_MAGIC      0x20b
 
+#define OptionalHeader(pe,field)                \
+  (IS_64BITS_PE(pe) ?                           \
+   pe->header64->OptionalHeader.field :         \
+   pe->header->OptionalHeader.field)
+
 
 typedef struct _IMAGE_NT_HEADERS32 {
     DWORD Signature;
@@ -302,6 +312,50 @@ typedef struct _IMAGE_NT_HEADERS64 {
 } IMAGE_NT_HEADERS64, *PIMAGE_NT_HEADERS64;
 
 
+//
+// Imports are stored in a linked list. Each node (IMPORTED_DLL) contains the
+// name of the DLL and a pointer to another linked list of IMPORTED_FUNCTION
+// structures containing the names of imported functions.
+//
+
+typedef struct _IMPORTED_DLL
+{
+  char *name;
+
+  struct _IMPORTED_FUNCTION *functions;
+  struct _IMPORTED_DLL *next;
+
+} IMPORTED_DLL, *PIMPORTED_DLL;
+
+
+typedef struct _IMPORTED_FUNCTION
+{
+  char *name;
+  uint8_t has_ordinal;
+  uint16_t ordinal;
+
+  struct _IMPORTED_FUNCTION *next;
+
+} IMPORTED_FUNCTION, *PIMPORTED_FUNCTION;
+
+
+typedef struct _PE
+{
+  uint8_t* data;
+  size_t data_size;
+
+  union {
+    PIMAGE_NT_HEADERS32 header;
+    PIMAGE_NT_HEADERS64 header64;
+  };
+
+  YR_OBJECT* object;
+  IMPORTED_DLL* imported_dlls;
+  uint32_t resources;
+
+} PE;
+
+
 // IMAGE_FIRST_SECTION doesn't need 32/64 versions since the file header is
 // the same either way.
 
@@ -481,10 +535,5 @@ typedef struct _RICH_SIGNATURE {
 #define RICH_DANS 0x536e6144 // "DanS"
 #define RICH_RICH 0x68636952 // "Rich"
 
-typedef struct _RICH_DATA {
-    size_t len;
-    BYTE* raw_data;
-    BYTE* clear_data;
-} RICH_DATA, *PRICH_DATA;
-
 #pragma pack(pop)
+#endif
diff --git a/libyara/include/yara/pe_utils.h b/libyara/include/yara/pe_utils.h
new file mode 100644
index 0000000..945d843
--- /dev/null
+++ b/libyara/include/yara/pe_utils.h
@@ -0,0 +1,30 @@
+#ifndef YR_PE_UTILS_H
+#define YR_PE_UTILS_H
+
+#include <yara/pe.h>
+
+#define MAX_PE_SECTIONS              96
+
+#define IS_64BITS_PE(pe) \
+    (pe->header64->OptionalHeader.Magic == IMAGE_NT_OPTIONAL_HDR64_MAGIC)
+
+#define fits_in_pe(pe, pointer, size) \
+    ((size_t) size <= pe->data_size && \
+     (uint8_t*) (pointer) >= pe->data && \
+     (uint8_t*) (pointer) <= pe->data + pe->data_size - size)
+
+#define struct_fits_in_pe(pe, pointer, struct_type) \
+    fits_in_pe(pe, pointer, sizeof(struct_type))
+
+PIMAGE_NT_HEADERS32 pe_get_header(uint8_t* data, size_t data_size);
+PIMAGE_DATA_DIRECTORY pe_get_directory_entry(PE* pe, int entry);
+PIMAGE_DATA_DIRECTORY pe_get_directory_entry(PE* pe, int entry);
+int64_t pe_rva_to_offset(PE* pe, uint64_t rva);
+char *ord_lookup(char *dll, uint16_t ord);
+
+#if HAVE_LIBCRYPTO
+#include <openssl/asn1.h>
+time_t ASN1_get_time_t(ASN1_TIME* time);
+#endif
+
+#endif
diff --git a/libyara/modules/dotnet.c b/libyara/modules/dotnet.c
new file mode 100644
index 0000000..8eb8dbf
--- /dev/null
+++ b/libyara/modules/dotnet.c
@@ -0,0 +1,1268 @@
+/*
+Copyright (c) 2015. The YARA Authors. All Rights Reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+*/
+
+#define _GNU_SOURCE
+
+#include <stdio.h>
+#include <stdarg.h>
+#include <ctype.h>
+#include <time.h>
+#include <config.h>
+
+#include <yara/pe.h>
+#include <yara/dotnet.h>
+#include <yara/modules.h>
+#include <yara/mem.h>
+
+#include <yara/pe_utils.h>
+
+#define MODULE_NAME dotnet
+
+
+char* pe_get_dotnet_string(
+    PE* pe,
+    uint8_t* string_offset,
+    DWORD string_index)
+{
+  size_t remaining;
+  char* start;
+  char* eos;
+
+  // Start of string must be within boundary
+  if (!(string_offset + string_index >= pe->data &&
+        string_offset + string_index < pe->data + pe->data_size))
+    return NULL;
+
+  // Calculate how much until end of boundary, don't scan past that.
+  remaining = (pe->data + pe->data_size) - (string_offset + string_index);
+
+  // Search for a NULL terminator from start of string, up to remaining.
+  start = (char*) (string_offset + string_index);
+  eos = (char*) memmem((void*) start, remaining, "\0", 1);
+  if (eos == NULL)
+    return eos;
+
+  return start;
+}
+
+
+uint32_t max_rows(int count, ...)
+{
+  va_list ap;
+  int i;
+  uint32_t biggest;
+  uint32_t x;
+
+  if (count == 0)
+    return 0;
+
+  va_start(ap, count);
+  biggest = va_arg(ap, uint32_t);
+  for (i = 1; i < count; i++)
+  {
+    x = va_arg(ap, uint32_t);
+    biggest = (x > biggest) ? x : biggest;
+  }
+
+  va_end(ap);
+  return biggest;
+}
+
+
+void dotnet_parse_guid(
+    PE* pe,
+    int64_t metadata_root,
+    PSTREAM_HEADER guid_header)
+{
+  // GUIDs are 16 bytes each, converted to hex format plus separators and NULL.
+  char guid[37];
+  int i = 0;
+
+  uint8_t* guid_offset = pe->data + metadata_root + guid_header->Offset;
+  DWORD guid_size = guid_header->Size;
+
+  // Parse GUIDs if we have them.
+  // GUIDs are 16 bytes each.
+  while (guid_size >= 16 && fits_in_pe(pe, guid_offset, 16))
+  {
+    sprintf(guid, "%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x",
+        *(uint32_t*) guid_offset,
+        *(uint16_t*) (guid_offset + 4),
+        *(uint16_t*) (guid_offset + 6),
+        *(guid_offset + 8),
+        *(guid_offset + 9),
+        *(guid_offset + 10),
+        *(guid_offset + 11),
+        *(guid_offset + 12),
+        *(guid_offset + 13),
+        *(guid_offset + 14),
+        *(guid_offset + 15));
+    guid[(16 * 2) + 4] = '\0';
+
+    set_string(guid, pe->object, "guids[%i]", i);
+
+    i++;
+    guid_size -= 16;
+  }
+
+  set_integer(i, pe->object, "number_of_guids");
+}
+
+
+STREAMS dotnet_parse_stream_headers(
+    PE* pe,
+    int64_t offset,
+    int64_t metadata_root,
+    DWORD num_streams)
+{
+  int i;
+  STREAMS headers;
+  char stream_name[DOTNET_STREAM_NAME_SIZE + 1];
+  PSTREAM_HEADER stream_header;
+
+  memset(&headers, '\0', sizeof(STREAMS));
+
+  stream_header = (PSTREAM_HEADER) (pe->data + offset);
+
+  for (i = 0; i < num_streams; i++)
+  {
+    if (!struct_fits_in_pe(pe, stream_header, STREAM_HEADER))
+      break;
+
+    strncpy(stream_name, stream_header->Name, DOTNET_STREAM_NAME_SIZE);
+    stream_name[DOTNET_STREAM_NAME_SIZE] = '\0';
+
+    set_string(stream_name,
+        pe->object, "streams[%i].name", i);
+    // Offset is relative to metadata_root.
+    set_integer(metadata_root + stream_header->Offset,
+        pe->object, "streams[%i].offset", i);
+    set_integer(stream_header->Size,
+        pe->object, "streams[%i].size", i);
+
+    // Store necessary bits to parse these later. Not all tables will be
+    // parsed, but are referenced from others. For example, the #Strings
+    // stream is referenced from various tables in the #~ heap.
+    if (strncmp(stream_name, "#GUID", 5) == 0)
+      headers.guid = stream_header;
+    // Believe it or not, I have seen at least one binary which has a #- stream
+    // instead of a #~ (215e1b54ae1aac153e55596e6f1a4350). This isn't in the
+    // documentation anywhere but the structure is the same. I'm choosing not
+    // to parse it for now.
+    else if (strncmp(stream_name, "#~", 2) == 0 && headers.tilde == NULL)
+      headers.tilde = stream_header;
+    else if (strncmp(stream_name, "#Strings", 8) == 0 && headers.string == NULL)
+      headers.string = stream_header;
+    else if (strncmp(stream_name, "#Blob", 5) == 0)
+      headers.blob = stream_header;
+
+    // Stream name is padded to a multiple of 4.
+    stream_header = (PSTREAM_HEADER) ((uint8_t*) stream_header +
+        sizeof(STREAM_HEADER) +
+        strlen(stream_name) +
+        4 - (strlen(stream_name) % 4));
+  }
+
+  set_integer(i, pe->object, "number_of_streams");
+
+  return headers;
+}
+
+
+// This is the second pass through the data for #~. The first pass collects
+// information on the number of rows for tables which have coded indexes.
+// This pass uses that information and the index_sizes to parse the tables
+// of interest.
+//
+// Because the indexes can vary in size depending upon the number of rows in
+// other tables it is impossible to use static sized structures. To deal with
+// this hardcode the sizes of each table based upon the documentation (for the
+// static sized portions) and use the variable sizes accordingly.
+void dotnet_parse_tilde_2(
+    PE* pe,
+    PTILDE_HEADER tilde_header,
+    int64_t resource_base,
+    int64_t metadata_root,
+    ROWS rows,
+    INDEX_SIZES index_sizes,
+    PSTREAMS streams)
+{
+  PMODULE_TABLE module_table;
+  PASSEMBLY_TABLE assembly_table;
+  PMANIFESTRESOURCE_TABLE manifestresource_table;
+  PMODULEREF_TABLE moduleref_table;
+  PCUSTOMATTRIBUTE_TABLE customattribute_table;
+  DWORD resource_size, implementation;
+  char *name;
+  char typelib[MAX_TYPELIB_SIZE + 1];
+  int i, bit_check;
+  int64_t resource_offset;
+  uint32_t row_size, row_count, counter;
+  uint8_t* string_offset;
+  uint8_t* blob_offset;
+  int matched_bits = 0;
+  uint32_t num_rows = 0;
+  uint32_t valid_rows = 0;
+  uint32_t* row_offset = NULL;
+  uint8_t* table_offset = NULL;
+  uint8_t* row_ptr = NULL;
+  // These are pointers and row sizes for tables of interest to us for special
+  // parsing. For example, we are interested in pulling out any CustomAttributes
+  // that are GUIDs so we need to be able to walk these tables. To find GUID
+  // CustomAttributes you need to walk the CustomAttribute table and look for
+  // any row with a Parent that indexes into the Assembly table and Type indexes
+  // into the MemberRef table. Then you follow the index into the MemberRef
+  // table and check the Class to make sure it indexes into TypeRef table. If it
+  // does you follow that index and make sure the Name is "GuidAttribute". If
+  // all that is valid then you can take the Value from the CustomAttribute
+  // table to find out the index into the Blob stream and parse that.
+  //
+  // Luckily we can abuse the fact that the order of the tables is guaranteed
+  // consistent (though some may not exist, but if they do exist they must exist
+  // in a certain order). The order is defined by their position in the Valid
+  // member of the tilde_header structure. By the time we are parsing the
+  // CustomAttribute table we have already recorded the location of the TypeRef
+  // and MemberRef tables, so we can follow the chain back up from
+  // CustomAttribute through MemberRef to TypeRef.
+  uint8_t* typeref_ptr = NULL;
+  uint8_t* memberref_ptr = NULL;
+  uint32_t typeref_row_size = 0;
+  uint32_t memberref_row_size = 0;
+  uint8_t* typeref_row = NULL;
+  uint8_t* memberref_row = NULL;
+  DWORD type_index;
+  DWORD class_index;
+  DWORD blob_index;
+  DWORD blob_length;
+  // These are used to determine the size of coded indexes, which are the
+  // dynamically sized columns for some tables. The coded indexes are
+  // documented in ECMA-335 Section II.24.2.6.
+  uint8_t index_size, index_size2;
+
+  // Number of rows is the number of bits set to 1 in Valid.
+  // Should use this technique:
+  // http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetKernighan
+  for (i = 0; i < 64; i++)
+    valid_rows += ((tilde_header->Valid >> i) & 0x01);
+
+  row_offset = (uint32_t*) (tilde_header + 1);
+  table_offset = (uint8_t*) row_offset;
+  table_offset += sizeof(uint32_t) * valid_rows;
+
+#define DOTNET_STRING_INDEX(Name) \
+  index_sizes.string == 2 ? Name.Name_Short : Name.Name_Long
+
+  string_offset = pe->data + metadata_root + streams->string->Offset;
+
+  // Now walk again this time parsing out what we care about.
+  for (bit_check = 0; bit_check < 64; bit_check++)
+  {
+    // If the Valid bit is not set for this table, skip it...
+    if (!((tilde_header->Valid >> bit_check) & 0x01))
+      continue;
+
+    // Make sure table_offset doesn't go crazy by inserting a large value
+    // for num_rows. For example edc05e49dd3810be67942b983455fd43 sets a
+    // large value for number of rows for the BIT_MODULE section.
+    if (!fits_in_pe(pe, table_offset, 1))
+      return;
+
+    num_rows = *(row_offset + matched_bits);
+
+    // Those tables which exist, but that we don't care about must be
+    // skipped.
+    //
+    // Sadly, given the dynamic sizes of some columns we can not have well
+    // defined structures for all tables and use them accordingly. To deal
+    // with this manually move the table_offset pointer by the appropriate
+    // number of bytes as described in the documentation for each table.
+    //
+    // The table structures are documented in ECMA-335 Section II.22.
+    switch (bit_check)
+    {
+      case BIT_MODULE:
+        module_table = (PMODULE_TABLE) table_offset;
+        name = pe_get_dotnet_string(pe,
+            string_offset,
+            DOTNET_STRING_INDEX(module_table->Name));
+        if (name != NULL)
+          set_string(name, pe->object, "module_name");
+
+        table_offset += (2 + index_sizes.string + (index_sizes.guid * 3)) * num_rows;
+        break;
+      case BIT_TYPEREF:
+        row_count = max_rows(4, rows.module, rows.moduleref, rows.assemblyref,
+            rows.typeref);
+
+        if (row_count > (0xFFFF >> 0x02))
+          index_size = 4;
+        else
+          index_size = 2;
+
+        row_size = (index_size + (index_sizes.string * 2));
+        typeref_row_size = row_size;
+        typeref_ptr = table_offset;
+        table_offset += row_size * num_rows;
+        break;
+      case BIT_TYPEDEF:
+        row_count = max_rows(3, rows.typedef_, rows.typeref, rows.typespec);
+
+        if (row_count > (0xFFFF >> 0x02))
+          index_size = 4;
+        else
+          index_size = 2;
+
+        table_offset += (4 + (index_sizes.string * 2) + index_size + index_sizes.field + index_sizes.methoddef) * num_rows;
+        break;
+      case BIT_FIELDPTR:
+        // This one is not documented in ECMA-335.
+        table_offset += (index_sizes.field) * num_rows;
+        break;
+      case BIT_FIELD:
+        table_offset += (2 + (index_sizes.string) + index_sizes.blob) * num_rows;
+        break;
+      case BIT_METHODDEFPTR:
+        // This one is not documented in ECMA-335.
+        table_offset += (index_sizes.methoddef) * num_rows;
+        break;
+      case BIT_METHODDEF:
+        table_offset += (4 + 2 + 2 + index_sizes.string + index_sizes.blob + index_sizes.param) * num_rows;
+        break;
+      case BIT_PARAM:
+        table_offset += (2 + 2 + index_sizes.string) * num_rows;
+        break;
+      case BIT_INTERFACEIMPL:
+        row_count = max_rows(3, rows.typedef_, rows.typeref, rows.typespec);
+
+        if (row_count > (0xFFFF >> 0x02))
+          index_size = 4;
+        else
+          index_size = 2;
+
+        table_offset += (index_sizes.typedef_ + index_size) * num_rows;
+        break;
+      case BIT_MEMBERREF:
+        row_count = max_rows(4, rows.methoddef, rows.moduleref, rows.typeref,
+            rows.typespec);
+
+        if (row_count > (0xFFFF >> 0x03))
+          index_size = 4;
+        else
+          index_size = 2;
+
+        row_size = (index_size + index_sizes.string + index_sizes.blob);
+        memberref_row_size = row_size;
+        memberref_ptr = table_offset;
+        table_offset += row_size * num_rows;
+        break;
+      case BIT_CONSTANT:
+        row_count = max_rows(3, rows.param, rows.field, rows.property);
+
+        if (row_count > (0xFFFF >> 0x02))
+          index_size = 4;
+        else
+          index_size = 2;
+
+        table_offset += (1 + 1 + index_size + index_sizes.blob) * num_rows;
+        break;
+      case BIT_CUSTOMATTRIBUTE:
+        // index_size is size of the parent column.
+        row_count = max_rows(21, rows.methoddef, rows.field, rows.typeref,
+            rows.typedef_, rows.param, rows.interfaceimpl, rows.memberref,
+            rows.module, rows.property, rows.event, rows.standalonesig,
+            rows.moduleref, rows.typespec, rows.assembly, rows.assemblyref,
+            rows.file, rows.exportedtype, rows.manifestresource,
+            rows.genericparam, rows.genericparamconstraint, rows.methodspec);
+
+        if (row_count > (0xFFFF >> 0x05))
+          index_size = 4;
+        else
+          index_size = 2;
+
+        // index_size2 is size of the type column.
+        row_count = max_rows(2, rows.methoddef, rows.memberref);
+
+        if (row_count > (0xFFFF >> 0x03))
+          index_size2 = 4;
+        else
+          index_size2 = 2;
+
+        row_size = (index_size + index_size2 + index_sizes.blob);
+        if (typeref_ptr != NULL && memberref_ptr != NULL)
+        {
+          row_ptr = table_offset;
+          for (i = 0; i < num_rows; i++)
+          {
+            if (!fits_in_pe(pe, row_ptr, row_size))
+              break;
+
+            // Check the Parent field.
+            customattribute_table = (PCUSTOMATTRIBUTE_TABLE) row_ptr;
+            if (index_size == 4)
+            {
+              // Low 5 bits tell us what this is an index into. Remaining bits
+              // tell us the index value.
+              // Parent must be an index into the Assembly (0x0E) table.
+              if ((*(DWORD*) customattribute_table & 0x1F) != 0x0E)
+              {
+                row_ptr += row_size;
+                continue;
+              }
+            }
+            else
+            {
+              // Low 5 bits tell us what this is an index into. Remaining bits
+              // tell us the index value.
+              // Parent must be an index into the Assembly (0x0E) table.
+              if ((*(WORD*) customattribute_table & 0x1F) != 0x0E)
+              {
+                row_ptr += row_size;
+                continue;
+              }
+            }
+
+            // Check the Type field.
+            customattribute_table = (PCUSTOMATTRIBUTE_TABLE) (row_ptr + index_size);
+            if (index_size2 == 4)
+            {
+              // Low 3 bits tell us what this is an index into. Remaining bits
+              // tell us the index value. Only values 2 and 3 are defined.
+              // Type must be an index into the MemberRef table.
+              if ((*(DWORD*) customattribute_table & 0x07) != 0x03)
+              {
+                row_ptr += row_size;
+                continue;
+              }
+
+              type_index = *(DWORD*) customattribute_table >> 3;
+            }
+            else
+            {
+              // Low 3 bits tell us what this is an index into. Remaining bits
+              // tell us the index value. Only values 2 and 3 are defined.
+              // Type must be an index into the MemberRef table.
+              if ((*(WORD*) customattribute_table & 0x07) != 0x03)
+              {
+                row_ptr += row_size;
+                continue;
+              }
+
+              // Cast the index to a 32bit value.
+              type_index = (DWORD) ((*(WORD*) customattribute_table >> 3));
+            }
+
+            if (type_index > 0)
+              type_index--;
+            // Now follow the Type index into the MemberRef table.
+            memberref_row = memberref_ptr + (memberref_row_size * type_index);
+            if (index_sizes.memberref == 4)
+            {
+              // Low 3 bits tell us what this is an index into. Remaining bits
+              // tell us the index value. Class must be an index into the
+              // TypeRef table.
+              if ((*(DWORD*) memberref_row & 0x07) != 0x01)
+              {
+                row_ptr += row_size;
+                continue;
+              }
+
+              class_index = *(DWORD*) memberref_row >> 3;
+            }
+            else
+            {
+              // Low 3 bits tell us what this is an index into. Remaining bits
+              // tell us the index value. Class must be an index into the
+              // TypeRef table.
+              if ((*(WORD*) memberref_row & 0x07) != 0x01)
+              {
+                row_ptr += row_size;
+                continue;
+              }
+
+              // Cast the index to a 32bit value.
+              class_index = (DWORD) (*(WORD*) memberref_row >> 3);
+            }
+
+            if (class_index > 0)
+              class_index--;
+            // Now follow the Class index into the TypeRef table.
+            typeref_row = typeref_ptr + (typeref_row_size * class_index);
+            // Skip over the ResolutionScope and check the Name field,
+            // which is an index into the Strings heap.
+            row_count = max_rows(4, rows.module, rows.moduleref,
+                                 rows.assemblyref, rows.typeref);
+            if (row_count > (0xFFFF >> 0x02))
+              typeref_row += 4;
+            else
+              typeref_row += 2;
+
+            if (index_sizes.string == 4)
+              name = pe_get_dotnet_string(pe, string_offset, *(DWORD*) typeref_row);
+            else
+              name = pe_get_dotnet_string(pe, string_offset, *(WORD*) typeref_row);
+
+            if (strncmp(name, "GuidAttribute", 13) != 0)
+            {
+              row_ptr += row_size;
+              continue;
+            }
+
+            // Get the Value field.
+            customattribute_table = (PCUSTOMATTRIBUTE_TABLE) (row_ptr + index_size + index_size2);
+            if (index_sizes.blob == 4)
+              blob_index = *(DWORD*) customattribute_table;
+            else
+              // Cast the value (index into blob table) to a 32bit value.
+              blob_index = (DWORD) (*(WORD*) customattribute_table);
+
+
+            // Everything checks out. Make sure the index into the blob field
+            // is valid (non-null and within range).
+            blob_offset = pe->data + metadata_root + streams->blob->Offset + blob_index;
+
+            // If index into blob is 0 or past the end of the blob stream, skip
+            // it. We don't know the size of the blob entry yet because that is
+            // encoded in the start.
+            if (blob_index == 0x00 || blob_offset >= pe->data + pe->data_size)
+            {
+              row_ptr += row_size;
+              continue;
+            }
+
+            // Blob size is encoded in the first 1, 2 or 4 bytes of the blob.
+            //
+            // If the high bit is not set the length is encoded in one byte.
+            //
+            // If the high 2 bits are 10 (base 2) then the length is encoded in
+            // the rest of the bits and the next byte.
+            //
+            // If the high 3 bits are 110 (base 2) then the length is encoded
+            // in the rest of the bits and the next 3 bytes.
+            //
+            // See ECMA-335 II.24.2.4 for details.
+            if ((*blob_offset & 0x80) == 0x00)
+            {
+              blob_length = (DWORD) *blob_offset;
+              blob_offset++;
+            }
+            else if (blob_offset + 1 < pe->data + pe->data_size &&
+                     (*blob_offset & 0xC0) == 0x80)
+            {
+              blob_length = ((DWORD) (*blob_offset & 0x3F) << 8) | *(blob_offset + 1);
+              blob_offset += 2;
+            }
+            else if (blob_offset + 4 < pe->data + pe->data_size &&
+                     (*blob_offset & 0xE0) == 0xC0)
+            {
+              blob_length = ((DWORD) (*blob_offset & 0x1F) << 24) | ((DWORD) *(blob_offset + 1) << 16) | ((DWORD) *(blob_offset + 2) << 8) | *(blob_offset + 3);
+              blob_offset += 4;
+            }
+            else
+            {
+              row_ptr += row_size;
+              continue;
+            }
+
+            // Quick sanity check to make sure the blob entry is within bounds.
+            if (blob_offset + blob_length >= pe->data + pe->data_size)
+            {
+              row_ptr += row_size;
+              continue;
+            }
+
+            // Custom attributes MUST have a 16 bit prolog of 0x0001
+            if (*(WORD*) blob_offset != 0x0001)
+            {
+              row_ptr += row_size;
+              continue;
+            }
+
+            // The next byte is the length of the string.
+            blob_offset += 2;
+            if (blob_offset + *blob_offset >= pe->data + pe->data_size)
+            {
+              row_ptr += row_size;
+              continue;
+            }
+            blob_offset += 1;
+            if (*blob_offset == 0xFF || *blob_offset == 0x00)
+            {
+              typelib[0] = '\0';
+            }
+            else
+            {
+              strncpy(typelib, (char*) blob_offset, MAX_TYPELIB_SIZE);
+              typelib[MAX_TYPELIB_SIZE] = '\0';
+            }
+            set_string(typelib, pe->object, "typelib");
+
+            row_ptr += row_size;
+          }
+        }
+
+        table_offset += row_size * num_rows;
+        break;
+      case BIT_FIELDMARSHAL:
+        row_count = max_rows(2, rows.field, rows.param);
+
+        if (row_count > (0xFFFF >> 0x01))
+          index_size = 4;
+        else
+          index_size = 2;
+
+        table_offset += (index_size + index_sizes.blob) * num_rows;
+        break;
+      case BIT_DECLSECURITY:
+        row_count = max_rows(3, rows.typedef_, rows.methoddef, rows.assembly);
+
+        if (row_count > (0xFFFF >> 0x02))
+          index_size = 4;
+        else
+          index_size = 2;
+
+        table_offset += (2 + index_size + index_sizes.blob) * num_rows;
+        break;
+      case BIT_CLASSLAYOUT:
+        table_offset += (2 + 4 + index_sizes.typedef_) * num_rows;
+        break;
+      case BIT_FIELDLAYOUT:
+        table_offset += (4 + index_sizes.field) * num_rows;
+        break;
+      case BIT_STANDALONESIG:
+        table_offset += (index_sizes.blob) * num_rows;
+        break;
+      case BIT_EVENTMAP:
+        table_offset += (index_sizes.typedef_ + index_sizes.event) * num_rows;
+        break;
+      case BIT_EVENTPTR:
+        // This one is not documented in ECMA-335.
+        table_offset += (index_sizes.event) * num_rows;
+        break;
+      case BIT_EVENT:
+        row_count = max_rows(3, rows.typedef_, rows.typeref, rows.typespec);
+
+        if (row_count > (0xFFFF >> 0x02))
+          index_size = 4;
+        else
+          index_size = 2;
+
+        table_offset += (2 + index_sizes.string + index_size) * num_rows;
+        break;
+      case BIT_PROPERTYMAP:
+        table_offset += (index_sizes.typedef_ + index_sizes.property) * num_rows;
+        break;
+      case BIT_PROPERTYPTR:
+        // This one is not documented in ECMA-335.
+        table_offset += (index_sizes.property) * num_rows;
+        break;
+      case BIT_PROPERTY:
+        table_offset += (2 + index_sizes.string + index_sizes.blob) * num_rows;
+        break;
+      case BIT_METHODSEMANTICS:
+        row_count = max_rows(2, rows.event, rows.property);
+
+        if (row_count > (0xFFFF >> 0x01))
+          index_size = 4;
+        else
+          index_size = 2;
+
+        table_offset += (2 + index_sizes.methoddef + index_size) * num_rows;
+        break;
+      case BIT_METHODIMPL:
+        row_count = max_rows(2, rows.methoddef, rows.memberref);
+
+        if (row_count > (0xFFFF >> 0x01))
+          index_size = 4;
+        else
+          index_size = 2;
+
+        table_offset += (index_sizes.typedef_ + (index_size * 2)) * num_rows;
+        break;
+      case BIT_MODULEREF:
+        row_ptr = table_offset;
+
+        // Can't use 'i' here because we only set the string if it is not
+        // NULL. Instead use 'counter'.
+        counter = 0;
+        for (i = 0; i < num_rows; i++)
+        {
+          moduleref_table = (PMODULEREF_TABLE) row_ptr;
+          name = pe_get_dotnet_string(pe,
+              string_offset,
+              DOTNET_STRING_INDEX(moduleref_table->Name));
+          if (name != NULL)
+          {
+            set_string(name, pe->object, "modulerefs[%i]", counter);
+            counter++;
+          }
+
+          row_ptr += index_sizes.string;
+        }
+
+        set_integer(counter, pe->object, "number_of_modulerefs");
+
+        table_offset += (index_sizes.string) * num_rows;
+        break;
+      case BIT_TYPESPEC:
+        table_offset += (index_sizes.blob) * num_rows;
+        break;
+      case BIT_IMPLMAP:
+        row_count = max_rows(2, rows.field, rows.methoddef);
+
+        if (row_count > (0xFFFF >> 0x01))
+          index_size = 4;
+        else
+          index_size = 2;
+
+        table_offset += (2 + index_size + index_sizes.string + index_sizes.moduleref) * num_rows;
+        break;
+      case BIT_FIELDRVA:
+        table_offset += (4 + index_sizes.field) * num_rows;
+        break;
+      case BIT_ENCLOG:
+        table_offset += (4 + 4) * num_rows;
+        break;
+      case BIT_ENCMAP:
+        table_offset += (4) * num_rows;
+        break;
+      case BIT_ASSEMBLY:
+        row_size = (4 + 2 + 2 + 2 + 2 + 4 + index_sizes.blob + (index_sizes.string * 2));
+        if (!fits_in_pe(pe, table_offset, row_size))
+          break;
+
+        row_ptr = table_offset;
+        assembly_table = (PASSEMBLY_TABLE) table_offset;
+
+        set_integer(assembly_table->MajorVersion,
+            pe->object, "assembly.version.major");
+        set_integer(assembly_table->MinorVersion,
+            pe->object, "assembly.version.minor");
+        set_integer(assembly_table->BuildNumber,
+            pe->object, "assembly.version.build_number");
+        set_integer(assembly_table->RevisionNumber,
+            pe->object, "assembly.version.revision_number");
+
+        // Can't use assembly_table here because the PublicKey comes before
+        // Name and is a variable length field.
+        if (index_sizes.string == 4)
+          name = pe_get_dotnet_string(pe,
+              string_offset,
+              *(DWORD*) (row_ptr + 4 + 2 + 2 + 2 + 2 + 4 + index_sizes.blob));
+        else
+          name = pe_get_dotnet_string(pe,
+              string_offset,
+              *(WORD*) (row_ptr + 4 + 2 + 2 + 2 + 2 + 4 + index_sizes.blob));
+
+        if (name != NULL)
+          set_string(name, pe->object, "assembly.name");
+
+        // Culture comes after Name.
+        if (index_sizes.string == 4)
+          name = pe_get_dotnet_string(pe,
+              string_offset,
+              *(DWORD*) (row_ptr + 4 + 2 + 2 + 2 + 2 + 4 + index_sizes.blob + index_sizes.string));
+        else
+          name = pe_get_dotnet_string(pe,
+              string_offset,
+              *(WORD*) (row_ptr + 4 + 2 + 2 + 2 + 2 + 4 + index_sizes.blob + index_sizes.string));
+
+        // Sometimes it will be a zero length string. This is technically
+        // against the specification but happens from time to time.
+        if (name != NULL && strlen(name) > 0)
+          set_string(name, pe->object, "assembly.culture");
+
+        table_offset += row_size * num_rows;
+        break;
+      case BIT_ASSEMBLYPROCESSOR:
+        table_offset += (4) * num_rows;
+        break;
+      case BIT_ASSEMBLYOS:
+        table_offset += (4 + 4 + 4) * num_rows;
+        break;
+      case BIT_ASSEMBLYREF:
+        table_offset += (2 + 2 + 2 + 2 + 4 + (index_sizes.blob * 2) + (index_sizes.string * 2)) * num_rows;
+        break;
+      case BIT_ASSEMBLYREFPROCESSOR:
+        table_offset += (4 + index_sizes.assemblyref) * num_rows;
+        break;
+      case BIT_ASSEMBLYREFOS:
+        table_offset += (4 + 4 + 4 + index_sizes.assemblyref) * num_rows;
+        break;
+      case BIT_FILE:
+        table_offset += (4 + index_sizes.string + index_sizes.blob) * num_rows;
+        break;
+      case BIT_EXPORTEDTYPE:
+        row_count = max_rows(3, rows.file, rows.assemblyref, rows.exportedtype);
+
+        if (row_count > (0xFFFF >> 0x02))
+          index_size = 4;
+        else
+          index_size = 2;
+
+        table_offset += (4 + 4 + (index_sizes.string * 2) + index_size) * num_rows;
+        break;
+      case BIT_MANIFESTRESOURCE:
+        // This is an Implementation coded index with no 3rd bit specified.
+        row_count = max_rows(2, rows.file, rows.assemblyref);
+
+        if (row_count > (0xFFFF >> 0x02))
+          index_size = 4;
+        else
+          index_size = 2;
+
+        row_size = (4 + 4 + index_sizes.string + index_size);
+
+        // Using 'i' is insufficient since we may skip certain resources and
+        // it would give an inaccurate count in that case.
+        counter = 0;
+        row_ptr = table_offset;
+        // First DWORD is the offset.
+        for (i = 0; i < num_rows; i++)
+        {
+          if (!fits_in_pe(pe, row_ptr, row_size))
+            break;
+
+          manifestresource_table = (PMANIFESTRESOURCE_TABLE) row_ptr;
+          resource_offset = manifestresource_table->Offset;
+
+          // Only set offset if it is in this file (implementation != 0).
+          // Can't use manifestresource_table here because the Name and
+          // Implementation fields are variable size.
+          if (index_size == 4)
+            implementation = *(DWORD*) (row_ptr + 4 + 4 + index_sizes.string);
+          else
+            implementation = *(WORD*) (row_ptr + 4 + 4 + index_sizes.string);
+
+          if (implementation != 0)
+          {
+            row_ptr += row_size;
+            continue;
+          }
+
+          if (!fits_in_pe(pe, pe->data + resource_base + resource_offset, sizeof(DWORD)))
+          {
+            row_ptr += row_size;
+            continue;
+          }
+
+          resource_size = *(DWORD*) (pe->data + resource_base + resource_offset);
+
+          if (!fits_in_pe(pe, pe->data + resource_base + resource_offset, resource_size))
+          {
+            row_ptr += row_size;
+            continue;
+          }
+
+          // Add 4 to skip the size.
+          set_integer(resource_base + resource_offset + 4,
+              pe->object, "resources[%i].offset", counter);
+
+          set_integer(resource_size,
+              pe->object, "resources[%i].length", counter);
+
+          name = pe_get_dotnet_string(pe,
+              string_offset,
+              DOTNET_STRING_INDEX(manifestresource_table->Name));
+          if (name != NULL)
+            set_string(name, pe->object, "resources[%i].name", counter);
+
+          row_ptr += row_size;
+          counter++;
+        }
+
+        set_integer(counter, pe->object, "number_of_resources");
+
+        table_offset += row_size * num_rows;
+        break;
+      case BIT_NESTEDCLASS:
+        table_offset += (index_sizes.typedef_ * 2) * num_rows;
+        break;
+      case BIT_GENERICPARAM:
+        row_count = max_rows(2, rows.typedef_, rows.methoddef);
+
+        if (row_count > (0xFFFF >> 0x01))
+          index_size = 4;
+        else
+          index_size = 2;
+
+        table_offset += (2 + 2 + index_size + index_sizes.string) * num_rows;
+        break;
+      case BIT_METHODSPEC:
+        row_count = max_rows(2, rows.methoddef, rows.memberref);
+
+        if (row_count > (0xFFFF >> 0x01))
+          index_size = 4;
+        else
+          index_size = 2;
+
+        table_offset += (index_size + index_sizes.blob) * num_rows;
+        break;
+      case BIT_GENERICPARAMCONSTRAINT:
+        row_count = max_rows(3, rows.typedef_, rows.typeref, rows.typespec);
+
+        if (row_count > (0xFFFF >> 0x02))
+          index_size = 4;
+        else
+          index_size = 2;
+
+        table_offset += (index_sizes.genericparam + index_size) * num_rows;
+        break;
+      default:
+        //printf("Unknown bit: %i\n", bit_check);
+        return;
+    }
+
+    matched_bits++;
+  }
+}
+
+
+// Parsing the #~ stream is done in two parts. The first part (this function)
+// parses enough of the Stream to provide context for the second pass. In
+// particular it is collecting the number of rows for each of the tables. The
+// second part parses the actual tables of interest.
+void dotnet_parse_tilde(
+    PE* pe,
+    int64_t metadata_root,
+    PCLI_HEADER cli_header,
+    PSTREAMS streams)
+{
+  PTILDE_HEADER tilde_header;
+  int64_t resource_base;
+  uint32_t* row_offset = NULL;
+  int bit_check;
+  // This is used as an offset into the rows and tables. For every bit set in
+  // Valid this will be incremented. This is because the bit position doesn't
+  // matter, just the number of bits that are set, when determining how many
+  // rows and what the table structure is.
+  int matched_bits = 0;
+  // We need to know the number of rows for some tables, because they are
+  // indexed into. The index will be either 2 or 4 bytes, depending upon the
+  // number of rows being indexed into.
+  ROWS rows;
+  INDEX_SIZES index_sizes;
+
+  // Default all rows to 0. They will be set to actual values later on, if
+  // they exist in the file.
+  memset(&rows, '\0', sizeof(ROWS));
+
+  // Default index sizes are 2. Will be bumped to 4 if necessary.
+  memset(&index_sizes, 2, sizeof(index_sizes));
+
+  tilde_header = (PTILDE_HEADER) (pe->data + metadata_root + streams->tilde->Offset);
+
+  if (!struct_fits_in_pe(pe, tilde_header, TILDE_HEADER))
+    return;
+
+  // Set index sizes for various heaps.
+  if (tilde_header->HeapSizes & 0x01)
+    index_sizes.string = 4;
+  if (tilde_header->HeapSizes & 0x02)
+    index_sizes.guid = 4;
+  if (tilde_header->HeapSizes & 0x04)
+    index_sizes.blob = 4;
+
+  // Immediately after the tilde header is an array of 32bit values which
+  // indicate how many rows are in each table. The tables are immediately
+  // after the rows array.
+  //
+  // Save the row offset.
+  row_offset = (uint32_t*) (tilde_header + 1);
+
+  // Walk all the bits first because we need to know the number of rows for
+  // some tables in order to parse others. In particular this applies to
+  // coded indexes, which are documented in ECMA-335 II.24.2.6.
+  for (bit_check = 0; bit_check < 64; bit_check++)
+  {
+    if (!((tilde_header->Valid >> bit_check) & 0x01))
+      continue;
+
+#define ROW_CHECK(name) \
+    rows.name = *(row_offset + matched_bits);
+
+#define ROW_CHECK_WITH_INDEX(name) \
+    ROW_CHECK(name); \
+    if (rows.name > 0xFFFF) \
+      index_sizes.name = 4;
+
+    switch (bit_check)
+    {
+      case BIT_MODULE:
+        ROW_CHECK(module);
+        break;
+      case BIT_MODULEREF:
+        ROW_CHECK_WITH_INDEX(moduleref);
+        break;
+      case BIT_ASSEMBLYREF:
+        ROW_CHECK_WITH_INDEX(assemblyref);
+        break;
+      case BIT_ASSEMBLYREFPROCESSOR:
+        ROW_CHECK_WITH_INDEX(assemblyrefprocessor);
+        break;
+      case BIT_TYPEREF:
+        ROW_CHECK(typeref);
+        break;
+      case BIT_METHODDEF:
+        ROW_CHECK_WITH_INDEX(methoddef);
+        break;
+      case BIT_MEMBERREF:
+        ROW_CHECK_WITH_INDEX(memberref);
+        break;
+      case BIT_TYPEDEF:
+        ROW_CHECK_WITH_INDEX(typedef_);
+        break;
+      case BIT_TYPESPEC:
+        ROW_CHECK(typespec);
+        break;
+      case BIT_FIELD:
+        ROW_CHECK_WITH_INDEX(field);
+        break;
+      case BIT_PARAM:
+        ROW_CHECK_WITH_INDEX(param);
+        break;
+      case BIT_PROPERTY:
+        ROW_CHECK_WITH_INDEX(property);
+        break;
+      case BIT_INTERFACEIMPL:
+        ROW_CHECK(interfaceimpl);
+        break;
+      case BIT_EVENT:
+        ROW_CHECK_WITH_INDEX(event);
+        break;
+      case BIT_STANDALONESIG:
+        ROW_CHECK(standalonesig);
+        break;
+      case BIT_ASSEMBLY:
+        ROW_CHECK(assembly);
+        break;
+      case BIT_FILE:
+        ROW_CHECK(file);
+        break;
+      case BIT_EXPORTEDTYPE:
+        ROW_CHECK(exportedtype);
+        break;
+      case BIT_MANIFESTRESOURCE:
+        ROW_CHECK(manifestresource);
+        break;
+      case BIT_GENERICPARAM:
+        ROW_CHECK_WITH_INDEX(genericparam);
+        break;
+      case BIT_GENERICPARAMCONSTRAINT:
+        ROW_CHECK(genericparamconstraint);
+        break;
+      case BIT_METHODSPEC:
+        ROW_CHECK(methodspec);
+        break;
+      default:
+        break;
+    }
+
+    matched_bits++;
+  }
+
+  // This is used when parsing the MANIFEST RESOURCE table.
+  resource_base = pe_rva_to_offset(pe, cli_header->Resources.VirtualAddress);
+
+  dotnet_parse_tilde_2(pe,
+                       tilde_header,
+                       resource_base,
+                       metadata_root,
+                       rows,
+                       index_sizes,
+                       streams);
+}
+
+void dotnet_parse_com(
+    PE* pe,
+    size_t base_address)
+{
+  PIMAGE_DATA_DIRECTORY directory;
+  PCLI_HEADER cli_header;
+  PNET_METADATA metadata;
+  int64_t metadata_root, offset;
+  char *version;
+  STREAMS headers;
+  WORD num_streams;
+
+  directory = pe_get_directory_entry(pe, IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR);
+
+  offset = pe_rva_to_offset(pe, directory->VirtualAddress);
+
+  if (offset < 0 || !struct_fits_in_pe(pe, pe->data + offset, CLI_HEADER))
+    return;
+
+  cli_header = (PCLI_HEADER) (pe->data + offset);
+
+  offset = metadata_root = pe_rva_to_offset(pe, cli_header->MetaData.VirtualAddress);
+  if (!struct_fits_in_pe(pe, pe->data + offset, NET_METADATA))
+    return;
+
+  metadata = (PNET_METADATA) (pe->data + offset);
+
+  if (metadata->Magic != NET_METADATA_MAGIC)
+    return;
+
+  // Version length must be between 1 and 255, and be a multiple of 4.
+  // Also make sure it fits in pe.
+  if (metadata->Length == 0 ||
+      metadata->Length > 255 ||
+      metadata->Length % 4 != 0 ||
+      !fits_in_pe(pe, pe->data + offset, metadata->Length))
+    return;
+
+  version = (char*) yr_malloc(metadata->Length + 1);
+
+  if (!version)
+    return;
+
+  strncpy(version, metadata->Version, metadata->Length);
+  version[metadata->Length] = '\0';
+  set_string(version, pe->object, "version");
+  yr_free(version);
+
+  // The metadata structure has some variable length records after the version.
+  // We must manually parse things from here on out.
+  //
+  // Flags are 2 bytes (always 0).
+  offset += sizeof(NET_METADATA) + metadata->Length + 2;
+
+  // 2 bytes for Streams.
+  if (!fits_in_pe(pe, pe->data + offset, 2))
+    return;
+
+  num_streams = *(WORD*) (pe->data + offset);
+  offset += 2;
+
+  headers = dotnet_parse_stream_headers(pe, offset, metadata_root, num_streams);
+
+  if (headers.guid != NULL)
+    dotnet_parse_guid(pe, metadata_root, headers.guid);
+
+  // Parse the #~ stream, which includes various tables of interest.
+  if (headers.tilde != NULL)
+    dotnet_parse_tilde(pe, metadata_root, cli_header, &headers);
+}
+
+
+begin_declarations;
+
+  declare_string("version");
+  declare_string("module_name");
+  begin_struct_array("streams");
+    declare_string("name");
+    declare_integer("offset");
+    declare_integer("size");
+  end_struct_array("streams");
+  declare_integer("number_of_streams");
+  declare_string_array("guids");
+  declare_integer("number_of_guids");
+  begin_struct_array("resources");
+    declare_integer("offset");
+    declare_integer("length");
+    declare_string("name");
+  end_struct_array("resources");
+  declare_integer("number_of_resources");
+  begin_struct("assembly");
+    begin_struct("version");
+      declare_integer("major");
+      declare_integer("minor");
+      declare_integer("build_number");
+      declare_integer("revision_number");
+    end_struct("version");
+    declare_string("name");
+    declare_string("culture");
+  end_struct("assembly");
+  declare_string_array("modulerefs");
+  declare_integer("number_of_modulerefs");
+  declare_string("typelib");
+
+end_declarations;
+
+
+int module_initialize(
+    YR_MODULE* module)
+{
+  return ERROR_SUCCESS;
+}
+
+
+int module_finalize(
+    YR_MODULE* module)
+{
+  return ERROR_SUCCESS;
+}
+
+
+int module_load(
+    YR_SCAN_CONTEXT* context,
+    YR_OBJECT* module_object,
+    void* module_data,
+    size_t module_data_size)
+{
+  YR_MEMORY_BLOCK* block;
+  YR_MEMORY_BLOCK_ITERATOR* iterator = context->iterator;
+  uint8_t* block_data = NULL;
+
+  foreach_memory_block(iterator, block)
+  {
+    block_data = block->fetch_data(block);
+
+    if (block_data == NULL)
+      continue;
+
+    PIMAGE_NT_HEADERS32 pe_header = pe_get_header(block_data, block->size);
+
+    if (pe_header != NULL)
+    {
+      // Ignore DLLs while scanning a process
+
+      if (!(context->flags & SCAN_FLAGS_PROCESS_MEMORY) ||
+          !(pe_header->FileHeader.Characteristics & IMAGE_FILE_DLL))
+      {
+        PE* pe = (PE*) yr_malloc(sizeof(PE));
+
+        if (pe == NULL)
+          return ERROR_INSUFICIENT_MEMORY;
+
+        pe->data = block_data;
+        pe->data_size = block->size;
+        pe->object = module_object;
+        pe->header = pe_header;
+
+        module_object->data = pe;
+
+        dotnet_parse_com(pe, block->base);
+
+        break;
+      }
+    }
+  }
+
+  return ERROR_SUCCESS;
+}
+
+
+int module_unload(
+    YR_OBJECT* module_object)
+{
+  PE* pe = (PE *) module_object->data;
+
+  if (pe == NULL)
+    return ERROR_SUCCESS;
+
+  yr_free(pe);
+
+  return ERROR_SUCCESS;
+}
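
Note on two encodings that the table-walking code in dotnet.c relies on, both
from ECMA-335: the width of a coded index (II.24.2.6) and the compressed length
prefix of a blob entry (II.24.2.4). What follows is a minimal standalone sketch,
not code from this commit. The helper names coded_index_size and
decode_blob_length are hypothetical, and the sketch assumes the length prefix is
stored most significant byte first, as the specification describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Width in bytes (2 or 4) of a coded index.
 * tag_bits: number of low bits that select the target table.
 * max_rows: largest row count among the tables the index may point into. */
static int coded_index_size(unsigned tag_bits, uint32_t max_rows)
{
  assert(tag_bits < 16);
  /* A 2-byte index leaves (16 - tag_bits) bits for the row number. */
  return max_rows > (0xFFFFu >> tag_bits) ? 4 : 2;
}

/* Decodes the compressed length prefix of a blob entry.
 * p points at the first prefix byte, avail is the number of readable bytes.
 * Returns the number of prefix bytes consumed (1, 2 or 4) and stores the
 * decoded length in *length, or returns 0 if the prefix is malformed or
 * truncated. */
static size_t decode_blob_length(
    const uint8_t* p,
    size_t avail,
    uint32_t* length)
{
  /* 0bbbbbbb: length is the low 7 bits. */
  if (avail >= 1 && (p[0] & 0x80) == 0x00)
  {
    *length = p[0];
    return 1;
  }

  /* 10bbbbbb x: length is 14 bits, high bits first. */
  if (avail >= 2 && (p[0] & 0xC0) == 0x80)
  {
    *length = ((uint32_t) (p[0] & 0x3F) << 8) | p[1];
    return 2;
  }

  /* 110bbbbb x y z: length is 29 bits, high bits first. */
  if (avail >= 4 && (p[0] & 0xE0) == 0xC0)
  {
    *length = ((uint32_t) (p[0] & 0x1F) << 24) | ((uint32_t) p[1] << 16) |
              ((uint32_t) p[2] << 8) | p[3];
    return 4;
  }

  return 0;
}
```

For example, a TypeDefOrRef coded index uses 2 tag bits, so
coded_index_size(2, n) stays at 2 bytes until the largest of the three target
tables exceeds 0x3FFF rows. Returning the prefix size from decode_blob_length
lets the caller advance its pointer by exactly the number of bytes consumed.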
diff --git a/libyara/modules/module_list b/libyara/modules/module_list
index e9d2682..b35346d 100644
--- a/libyara/modules/module_list
+++ b/libyara/modules/module_list
@@ -2,6 +2,7 @@ MODULE(tests)
 MODULE(pe)
 MODULE(elf)
 MODULE(math)
+MODULE(dotnet)
 
 #ifdef CUCKOO_MODULE
 MODULE(cuckoo)
diff --git a/libyara/modules/pe.c b/libyara/modules/pe.c
index 1af5562..8c4cb12 100644
--- a/libyara/modules/pe.c
+++ b/libyara/modules/pe.c
@@ -52,7 +52,7 @@ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 #include <yara/mem.h>
 #include <yara/strutils.h>
 
-#include "pe_utils.c"
+#include <yara/pe_utils.h>
 
 #define MODULE_NAME pe
 
@@ -88,7 +88,6 @@ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 #define RESOURCE_ITERATOR_ABORTED    1
 
 
-#define MAX_PE_SECTIONS              96
 #define MAX_PE_IMPORTS               16384
 #define MAX_PE_EXPORTS               65535
 
@@ -101,24 +100,10 @@ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
     ((entry)->OffsetToData & 0x7FFFFFFF)
 
 
-#define IS_64BITS_PE(pe) \
-    (pe->header64->OptionalHeader.Magic == IMAGE_NT_OPTIONAL_HDR64_MAGIC)
-
-
 #define available_space(pe, pointer) \
     (pe->data + pe->data_size - (uint8_t*)(pointer))
 
 
-#define fits_in_pe(pe, pointer, size) \
-    ((size_t) size <= pe->data_size && \
-     (uint8_t*) (pointer) >= pe->data && \
-     (uint8_t*) (pointer) <= pe->data + pe->data_size - size)
-
-
-#define struct_fits_in_pe(pe, pointer, struct_type) \
-    fits_in_pe(pe, pointer, sizeof(struct_type))
-
-
 typedef int (*RESOURCE_CALLBACK_FUNC) ( \
      PIMAGE_RESOURCE_DATA_ENTRY rsrc_data, \
      int rsrc_type, \
@@ -130,50 +115,6 @@ typedef int (*RESOURCE_CALLBACK_FUNC) ( \
      void* cb_data);
 
 
-//
-// Imports are stored in a linked list. Each node (IMPORTED_DLL) contains the
-// name of the DLL and a pointer to another linked list of IMPORTED_FUNCTION
-// structures containing the names of imported functions.
-//
-
-typedef struct _IMPORTED_DLL
-{
-  char *name;
-
-  struct _IMPORTED_FUNCTION *functions;
-  struct _IMPORTED_DLL *next;
-
-} IMPORTED_DLL, *PIMPORTED_DLL;
-
-
-typedef struct _IMPORTED_FUNCTION
-{
-  char *name;
-  uint8_t has_ordinal;
-  uint16_t ordinal;
-
-  struct _IMPORTED_FUNCTION *next;
-
-} IMPORTED_FUNCTION, *PIMPORTED_FUNCTION;
-
-
-typedef struct _PE
-{
-  uint8_t* data;
-  size_t data_size;
-
-  union {
-    PIMAGE_NT_HEADERS32 header;
-    PIMAGE_NT_HEADERS64 header64;
-  };
-
-  YR_OBJECT* object;
-  IMPORTED_DLL* imported_dlls;
-  uint32_t resources;
-
-} PE;
-
-
 int wide_string_fits_in_pe(
     PE* pe,
     char* data)
@@ -193,71 +134,6 @@ int wide_string_fits_in_pe(
 }
 
 
-PIMAGE_NT_HEADERS32 pe_get_header(
-    uint8_t* data,
-    size_t data_size)
-{
-  PIMAGE_DOS_HEADER mz_header;
-  PIMAGE_NT_HEADERS32 pe_header;
-
-  size_t headers_size = 0;
-
-  if (data_size < sizeof(IMAGE_DOS_HEADER))
-    return NULL;
-
-  mz_header = (PIMAGE_DOS_HEADER) data;
-
-  if (mz_header->e_magic != IMAGE_DOS_SIGNATURE)
-    return NULL;
-
-  if (mz_header->e_lfanew < 0)
-    return NULL;
-
-  headers_size = mz_header->e_lfanew + \
-                 sizeof(pe_header->Signature) + \
-                 sizeof(IMAGE_FILE_HEADER);
-
-  if (data_size < headers_size)
-    return NULL;
-
-  pe_header = (PIMAGE_NT_HEADERS32) (data + mz_header->e_lfanew);
-
-  headers_size += pe_header->FileHeader.SizeOfOptionalHeader;
-
-  if (pe_header->Signature == IMAGE_NT_SIGNATURE &&
-      (pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_UNKNOWN ||
-       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_AM33 ||
-       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_AMD64 ||
-       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_ARM ||
-       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_ARMNT ||
-       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_ARM64 ||
-       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_EBC ||
-       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_I386 ||
-       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_IA64 ||
-       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_M32R ||
-       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_MIPS16 ||
-       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_MIPSFPU ||
-       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_MIPSFPU16 ||
-       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_POWERPC ||
-       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_POWERPCFP ||
-       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_R4000 ||
-       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_SH3 ||
-       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_SH3DSP ||
-       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_SH4 ||
-       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_SH5 ||
-       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_THUMB ||
-       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_WCEMIPSV2) &&
-      data_size > headers_size)
-  {
-    return pe_header;
-  }
-  else
-  {
-    return NULL;
-  }
-}
-
-
 // Parse the rich signature.
 // http://www.ntcore.com/files/richsign.htm
 
@@ -372,119 +248,6 @@ void pe_parse_rich_signature(
 }
 
 
-PIMAGE_DATA_DIRECTORY pe_get_directory_entry(
-    PE* pe,
-    int entry)
-{
-  PIMAGE_DATA_DIRECTORY result;
-
-  if (IS_64BITS_PE(pe))
-    result = &pe->header64->OptionalHeader.DataDirectory[entry];
-  else
-    result = &pe->header->OptionalHeader.DataDirectory[entry];
-
-  return result;
-}
-
-
-#define OptionalHeader(pe,field)                \
-  (IS_64BITS_PE(pe) ?                           \
-   pe->header64->OptionalHeader.field :         \
-   pe->header->OptionalHeader.field)
-
-
-int64_t pe_rva_to_offset(
-    PE* pe,
-    uint64_t rva)
-{
-  PIMAGE_SECTION_HEADER section = IMAGE_FIRST_SECTION(pe->header);
-
-  DWORD lowest_section_rva = 0xffffffff;
-  DWORD section_rva = 0;
-  DWORD section_offset = 0;
-  DWORD section_raw_size = 0;
-
-  int64_t result;
-
-  int i = 0;
-
-  int alignment = 0;
-  int rest = 0;
-
-  while(i < yr_min(pe->header->FileHeader.NumberOfSections, MAX_PE_SECTIONS))
-  {
-    if (struct_fits_in_pe(pe, section, IMAGE_SECTION_HEADER))
-    {
-      if (lowest_section_rva > section->VirtualAddress)
-      {
-        lowest_section_rva = section->VirtualAddress;
-      }
-
-      if (rva >= section->VirtualAddress &&
-          section_rva <= section->VirtualAddress)
-      {
-        // Round section_offset
-        //
-        // Rounding everything less than 0x200 to 0 as discussed in
-        // https://code.google.com/archive/p/corkami/wikis/PE.wiki#PointerToRawData
-        // does not work for PE32_FILE from the test suite and for
-        // some tinype samples where File Alignment = 4
-        // (http://www.phreedom.org/research/tinype/).
-        //
-        // If FileAlignment is >= 0x200, it is apparently ignored (see
-        // Ero Carreras's pefile.py, PE.adjust_FileAlignment).
-
-        alignment = yr_min(OptionalHeader(pe, FileAlignment), 0x200);
-
-        section_rva = section->VirtualAddress;
-        section_offset = section->PointerToRawData;
-        section_raw_size = section->SizeOfRawData;
-
-        if (alignment)
-        {
-          rest = section_offset % alignment;
-
-          if (rest)
-            section_offset -= rest;
-        }
-      }
-
-      section++;
-      i++;
-    }
-    else
-    {
-      return -1;
-    }
-  }
-
-  // Everything before the first section seems to get mapped straight
-  // relative to ImageBase.
-
-  if (rva < lowest_section_rva)
-  {
-    section_rva = 0;
-    section_offset = 0;
-    section_raw_size = (DWORD) pe->data_size;
-  }
-
-  // Many sections, have a raw (on disk) size smaller than their in-memory size.
-  // Check for rva's that map to this sparse space, and therefore have no valid
-  // associated file offset.
-
-  if ((rva - section_rva) >= section_raw_size)
-    return -1;
-
-  result = section_offset + (rva - section_rva);
-
-  // Check that the offset fits within the file.
-  if (result >= pe->data_size)
-    return -1;
-
-  return result;
-}
-
-
 // Return a pointer to the resource directory string or NULL.
 // The callback function will parse this and call set_sized_string().
 // The pointer is guranteed to have enough space to contain the entire string.
diff --git a/libyara/modules/pe_utils.c b/libyara/modules/pe_utils.c
index 81a7ae4..6873927 100644
--- a/libyara/modules/pe_utils.c
+++ b/libyara/modules/pe_utils.c
@@ -1,3 +1,192 @@
+#include <string.h>
+#include <strings.h>
+
+#include <yara/utils.h>
+#include <yara/strutils.h>
+#include <yara/mem.h>
+#include <yara/pe_utils.h>
+#include <yara/pe.h>
+
+#if HAVE_LIBCRYPTO
+#include <openssl/asn1.h>
+#endif
+
+PIMAGE_NT_HEADERS32 pe_get_header(
+    uint8_t* data,
+    size_t data_size)
+{
+  PIMAGE_DOS_HEADER mz_header;
+  PIMAGE_NT_HEADERS32 pe_header;
+
+  size_t headers_size = 0;
+
+  if (data_size < sizeof(IMAGE_DOS_HEADER))
+    return NULL;
+
+  mz_header = (PIMAGE_DOS_HEADER) data;
+
+  if (mz_header->e_magic != IMAGE_DOS_SIGNATURE)
+    return NULL;
+
+  if (mz_header->e_lfanew < 0)
+    return NULL;
+
+  headers_size = mz_header->e_lfanew + \
+                 sizeof(pe_header->Signature) + \
+                 sizeof(IMAGE_FILE_HEADER);
+
+  if (data_size < headers_size)
+    return NULL;
+
+  pe_header = (PIMAGE_NT_HEADERS32) (data + mz_header->e_lfanew);
+
+  headers_size += pe_header->FileHeader.SizeOfOptionalHeader;
+
+  if (pe_header->Signature == IMAGE_NT_SIGNATURE &&
+      (pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_UNKNOWN ||
+       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_AM33 ||
+       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_AMD64 ||
+       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_ARM ||
+       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_ARMNT ||
+       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_ARM64 ||
+       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_EBC ||
+       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_I386 ||
+       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_IA64 ||
+       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_M32R ||
+       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_MIPS16 ||
+       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_MIPSFPU ||
+       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_MIPSFPU16 ||
+       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_POWERPC ||
+       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_POWERPCFP ||
+       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_R4000 ||
+       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_SH3 ||
+       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_SH3DSP ||
+       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_SH4 ||
+       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_SH5 ||
+       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_THUMB ||
+       pe_header->FileHeader.Machine == IMAGE_FILE_MACHINE_WCEMIPSV2) &&
+      data_size > headers_size)
+  {
+    return pe_header;
+  }
+  else
+  {
+    return NULL;
+  }
+}
+
+
+PIMAGE_DATA_DIRECTORY pe_get_directory_entry(
+    PE* pe,
+    int entry)
+{
+  PIMAGE_DATA_DIRECTORY result;
+
+  if (IS_64BITS_PE(pe))
+    result = &pe->header64->OptionalHeader.DataDirectory[entry];
+  else
+    result = &pe->header->OptionalHeader.DataDirectory[entry];
+
+  return result;
+}
+
+
+#define OptionalHeader(pe,field)                \
+  (IS_64BITS_PE(pe) ?                           \
+   pe->header64->OptionalHeader.field :         \
+   pe->header->OptionalHeader.field)
+
+
+int64_t pe_rva_to_offset(
+    PE* pe,
+    uint64_t rva)
+{
+  PIMAGE_SECTION_HEADER section = IMAGE_FIRST_SECTION(pe->header);
+
+  DWORD lowest_section_rva = 0xffffffff;
+  DWORD section_rva = 0;
+  DWORD section_offset = 0;
+  DWORD section_raw_size = 0;
+
+  int64_t result;
+
+  int i = 0;
+
+  int alignment = 0;
+  int rest = 0;
+
+  while(i < yr_min(pe->header->FileHeader.NumberOfSections, MAX_PE_SECTIONS))
+  {
+    if (struct_fits_in_pe(pe, section, IMAGE_SECTION_HEADER))
+    {
+      if (lowest_section_rva > section->VirtualAddress)
+      {
+        lowest_section_rva = section->VirtualAddress;
+      }
+
+      if (rva >= section->VirtualAddress &&
+          section_rva <= section->VirtualAddress)
+      {
+        // Round section_offset
+        //
+        // Rounding everything less than 0x200 to 0 as discussed in
+        // https://code.google.com/archive/p/corkami/wikis/PE.wiki#PointerToRawData
+        // does not work for PE32_FILE from the test suite and for
+        // some tinype samples where File Alignment = 4
+        // (http://www.phreedom.org/research/tinype/).
+        //
+        // If FileAlignment is >= 0x200, it is apparently ignored (see
+        // Ero Carreras's pefile.py, PE.adjust_FileAlignment).
+
+        alignment = yr_min(OptionalHeader(pe, FileAlignment), 0x200);
+
+        section_rva = section->VirtualAddress;
+        section_offset = section->PointerToRawData;
+        section_raw_size = section->SizeOfRawData;
+
+        if (alignment)
+        {
+          rest = section_offset % alignment;
+
+          if (rest)
+            section_offset -= rest;
+        }
+      }
+
+      section++;
+      i++;
+    }
+    else
+    {
+      return -1;
+    }
+  }
+
+  // Everything before the first section seems to get mapped straight
+  // relative to ImageBase.
+
+  if (rva < lowest_section_rva)
+  {
+    section_rva = 0;
+    section_offset = 0;
+    section_raw_size = (DWORD) pe->data_size;
+  }
+
+  // Many sections, have a raw (on disk) size smaller than their in-memory size.
+  // Check for rva's that map to this sparse space, and therefore have no valid
+  // associated file offset.
+
+  if ((rva - section_rva) >= section_raw_size)
+    return -1;
+
+  result = section_offset + (rva - section_rva);
+
+  // Check that the offset fits within the file.
+  if (result >= pe->data_size)
+    return -1;
+
+  return result;
+}
 
 
 #include <stdio.h>
@@ -57,7 +246,7 @@ time_t timegm(
 // Taken from http://stackoverflow.com/questions/10975542/asn1-time-conversion
 // and cleaned up. Also uses timegm(3) instead of mktime(3).
 
-static time_t ASN1_get_time_t(
+time_t ASN1_get_time_t(
   	ASN1_TIME* time)
 {
   struct tm t;
@@ -105,7 +294,7 @@ static time_t ASN1_get_time_t(
 // "ordN" and if that fails, return NULL. The caller is responsible for freeing
 // the returned string.
 
-static char *ord_lookup(
+char *ord_lookup(
     char *dll,
     uint16_t ord)
 {
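
For readers following the move of pe_rva_to_offset() into pe_utils.c, here is a minimal standalone sketch of the same mapping logic. It is not the libyara code. The type sketch_section and the function sketch_rva_to_offset are hypothetical names introduced only for this illustration, and the section table, file alignment and file size are passed in directly instead of being read from a PE structure. The sketch picks the section with the highest VirtualAddress not above the RVA, rounds its raw pointer down to min(FileAlignment, 0x200), maps RVAs below the first section straight to file offsets, and rejects RVAs that fall past the section's raw data or past the end of the file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for IMAGE_SECTION_HEADER. */
typedef struct
{
  uint32_t virtual_address;
  uint32_t pointer_to_raw_data;
  uint32_t size_of_raw_data;
} sketch_section;

/* Returns the file offset for rva, or -1 if it has no backing bytes. */
int64_t sketch_rva_to_offset(
    const sketch_section* sections,
    size_t count,
    uint32_t file_alignment,
    uint64_t file_size,
    uint64_t rva)
{
  uint32_t lowest_section_rva = 0xffffffff;
  uint32_t section_rva = 0;
  uint32_t section_offset = 0;
  uint32_t section_raw_size = 0;

  /* Alignments of 0x200 or more are capped at 0x200, as in the diff. */
  uint32_t alignment = file_alignment < 0x200 ? file_alignment : 0x200;
  int64_t result;
  size_t i;

  assert(sections != NULL || count == 0);

  for (i = 0; i < count; i++)
  {
    if (lowest_section_rva > sections[i].virtual_address)
      lowest_section_rva = sections[i].virtual_address;

    /* Keep the matching section with the highest start address. */
    if (rva >= sections[i].virtual_address &&
        section_rva <= sections[i].virtual_address)
    {
      section_rva = sections[i].virtual_address;
      section_offset = sections[i].pointer_to_raw_data;
      section_raw_size = sections[i].size_of_raw_data;

      /* Round the raw pointer down to the effective alignment. */
      if (alignment)
        section_offset -= section_offset % alignment;
    }
  }

  /* RVAs before the first section are treated as plain file offsets. */
  if (rva < lowest_section_rva)
  {
    section_rva = 0;
    section_offset = 0;
    section_raw_size = (uint32_t) file_size;
  }

  /* The RVA lies in the part of the section that exists only in memory. */
  if ((rva - section_rva) >= section_raw_size)
    return -1;

  result = (int64_t) (section_offset + (rva - section_rva));

  /* The computed offset must fall inside the file. */
  if ((uint64_t) result >= file_size)
    return -1;

  return result;
}
```

With two sections at RVA 0x1000 (raw pointer 0x400, raw size 0x200) and RVA 0x2000 (raw pointer 0x600, raw size 0x100) in a 0x800-byte file, RVA 0x1010 maps to offset 0x410, while RVA 0x2100 yields -1 because it lies beyond the second section's raw data.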

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/forensics/yara.git


