Design of GPT-based Cybersecurity Tools

Recently, GPT Plus has been launched, and I feel that the performance of GPT-4 is indeed much stronger than that of 3.5. It's not just about context capabilities and word limits; the logical abilities have also improved significantly. Finally, I can use it to liberate productivity, which led to this article. I am not a professional developer, and my code is quite poor, so please be gentle.

01#

As we all know, a large part of the work in security positions involves writing reports. Often, a vulnerability can be discovered in five minutes, but writing the report takes fifteen minutes. A report must be written according to a specific format and syntax, which can be quite torturous. Therefore, we need to let AI free us from this pain.

Design Ideas#

Have a UI interface
Facilitate file processing
Use AI to handle language description issues
Allow manual intervention in results

First, I wrote a UI interface. To speed up development, I chose Python and opted for web development for lightweight implementation.

Starting with Flask

app = Flask(__name__, template_folder='templates', static_folder='static')

@app.route('/', methods=['GET', 'POST'])
def index():
    if request.method == 'POST':
        vuln_name = request.form.get('vuln_name', default="SQL Injection")
        vuln_point = request.form.get('vuln_point', default="www.google.com")
        beizhu = request.form.get('beizhu', default="")
        language = request.form.get('language')
        start = time.time()
        report = generate_report(vuln_name, vuln_point, beizhu, language)
        end = time.time()
        times = end - start

        return render_template('index.html', report=report, times=times)
    else:
        return render_template('index.html')

I created a default route and loaded the template index.html. The main body of index.html is a form.

Form:

Vulnerability Name (e.g., SQL Injection)
Vulnerability Point (e.g., www.test.com/id=111)
Remarks (such as more details in the report, provide xx POCs, etc.)

Next, I wrote an HTML file to simply implement this page

<!DOCTYPE html>
<html>

<head>
  <meta charset="UTF-8">
  <title>Vulnerability Report Generator</title>
  <style>
    body {
      font-family: Arial, sans-serif;
      margin: 0;
      padding: 0;
    }

    header {
      background-color: #005293;
      color: #fff;
      padding: 10px;
    }

    h1 {
      margin: 0;
    }

    form {
      margin: 20px;
    }

    label {
      display: block;
      margin-bottom: 10px;
    }

    input[type="text"] {
      width: 100%;
      padding: 10px;
      border: 1px solid #ccc;
      border-radius: 5px;
      margin-bottom: 20px;
      font-size: 16px;
    }

    input[type="submit"] {
      background-color: #005293;
      color: #fff;
      border: none;
      padding: 10px;
      border-radius: 5px;
      cursor: pointer;
      font-size: 16px;
    }

    #report {
      margin: 20px;
      border: 1px solid #ccc;
      padding: 10px;
      border-radius: 5px;
      font-size: 16px;
    }
  </style>

  <link rel="stylesheet"
    href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.7.0/styles/panda-syntax-dark.min.css">

  <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.7.0/highlight.min.js"
    integrity="sha512-bgHRAiTjGrzHzLyKOnpFvaEpGzJet3z4tZnXGjpsCcqOnAH6VGUx9frc5bcIhKTVLEiCO6vEhNAgx5jtLUYrfA=="
    crossorigin="anonymous" referrerpolicy="no-referrer"></script>

  <script>hljs.initHighlightingOnLoad();</script>
  <script src="https://cdn.jsdelivr.net/npm/[email protected]/base64.min.js"></script>
</head>

<body>
  <header>
    <h1>Vulnerability Report Generator</h1>

  </header>

  <form method="POST" action="/">

    <select name="language">
      <option value="中文">Chinese</option>
      <option value="英文">English</option>
    </select>

    <label>Vulnerability Name:</label>
    <input type="text" name="vuln_name" placeholder="Please enter the vulnerability name">
    <label>Vulnerability Point:</label>
    <input type="text" name="vuln_point" placeholder="Please enter the vulnerability point">
    <label>Additional Requirements for the Report:</label>
    <input type="text" name="beizhu" placeholder="Additional requirements for the report">

    <input type="submit" value="Generate Vulnerability Report"> <button id="copy-btn" onclick="copyReport()" type="button">Copy</button><!--<button id="copy-btn" onclick="saveReport()" type="button">Generate File</button>-->
    Time taken for this execution: {{ times }} seconds
  </form>


  {% if report %}

  <pre id="report" style="white-space: pre-wrap;overflow-wrap: break-word;">
        <code>
      {{ report }}
      </code>
    </pre>


  <br>

  {% if report %}

  {% endif %}

  <script>
    function copyReport() {
      var reportText = document.querySelector("#report code").innerText;
      navigator.clipboard.writeText(reportText)
        .then(function () {
          alert("Vulnerability report has been copied to clipboard!");
        })
        .catch(function (error) {
          alert("Copy failed, please copy manually.");
        });
    }
  </script>

  {% endif %}
</body>

</html>

The page looks something like this, and I will improve some small details.

Next is file processing. Is there a file or encoding format that is pleasant to read and can be parsed by browser text editors? Yes, Markdown is the first choice. So I added code highlighting in index.html, and the generated value is directly filled into it.

Final implementation effect:

Then there is manual control, but this has already been accomplished by the remarks option, so I won't elaborate further.

Finally, the most critical part

Integrating with the ChatGPT API to obtain information. Here, I recommend a GitHub project where you can find development materials about ChatGPT: https://github.com/easychen/openai-gpt-dev-notes-for-cn-developer

Next, I wrote the logic code

def generate_report(vuln_name, vuln_point, beihzu="", language=""):
    # Construct GPT-3 input
    prompt = f"" # Fill in the adjusted prompt here

    #api_base = {"socks5":proxy}
    api_base = {"http": proxy, "https": proxy}

    # Call OpenAI API to generate vulnerability report
    headers = {
        # Already added when you pass json= but not when you pass data=
        'Content-Type': 'application/json',
        'Authorization': "Bearer "+api_key,
    }

    json_data = {
        'model': 'gpt-3.5-turbo',
        'messages': [
            {
                'role': 'user',
                'content': prompt,
            },
        ],
    }
    # Set proxy
    # Calculate time

    response = requests.post('https://api.openai.com/v1/chat/completions',
                             headers=headers, json=json_data, proxies=api_base)

    # Process OpenAI API response
    if response.status_code == 200:
        text = response.text
        text = json.loads(text)
        text = text['choices'][0]['message']['content']
        report = f"{text}"

    else:
        report = "Failed to generate vulnerability report, please check the input and try again. Status code: "+str(response.status_code)

    return report

Here, I did not use the official method for calling but used HTTP request interface for convenience with the proxy.

The general logic is like this, and I will also write a file saving function

def outfile():
    # Get the base64 encoded data posted
    date=request.form.get('reportText')
    
    # Decode the base64 data
    date=base64.b64decode(date,).decode('utf-8')
    # Write the data to an md file, with the filename as the current time
    fname=time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
    with open(fname+'.md','w') as f:
        f.write(date)
    return "ok"

That's about it, and here is the final effect picture

Finally, to wrap it all up, the project is hosted at
https://github.com/shiyeshu/GPTreport

If you like it, please give it a star.

02#

Actually, this is the main event, but the company has a strange requirement. I need to register a patent by the end of April as a sacrifice to the company. I have no choice but to prepare for this sacrifice.

Let me briefly discuss my thoughts.

I want to create an AI code auditing tool, which will definitely be more complex and practical than the previous one.

First, AI cannot perform full-text analysis of the entire source code; it can only analyze small segments. Therefore, I want to use some means for preprocessing. By using vulnerability matching rules from some software for pre-scanning, we can identify lines of code that have risk hazards. Then, for that line of code, there are two operations: determine whether it is within a function. If it is a function, perform a whole function analysis; if not, perform context linkage analysis within a threshold. A more advanced approach could involve tracking variable assignment operations. This would essentially complete the entire program auditing logic. There are many directions to explore, but I won't elaborate further here. The code is already written and is currently in my hands. If the company does not urge me to submit it after April, I will open-source it.

Here’s a small preview of the tool

Finally, I don't often write articles, so my writing may be poor and disorganized. I hope everyone can understand. Some of the topics mentioned in this article are quite popular, so please view them purely from a technical perspective. Embrace open source but do not be a freeloader.