
Commit 5b65518

Merge pull request #81 from ipa-lab/development
merge the current development branch into master

2 parents 4ea46ea + aafabf1
39 files changed: +2422 −659 lines

.github/workflows/python-app.yml

Lines changed: 2 additions & 2 deletions

```diff
@@ -5,9 +5,9 @@ name: Python application

 on:
   push:
-    branches: [ "main" ]
+    branches: [ "main", "development" ]
   pull_request:
-    branches: [ "main" ]
+    branches: [ "main", "development" ]

 permissions:
   contents: read
```

.gitignore

Lines changed: 3 additions & 0 deletions

```diff
@@ -11,3 +11,6 @@ src/hackingBuddyGPT.egg-info/
 build/
 dist/
 .coverage
+src/hackingBuddyGPT/usecases/web_api_testing/openapi_spec/
+src/hackingBuddyGPT/usecases/web_api_testing/converted_files/
+/src/hackingBuddyGPT/usecases/web_api_testing/utils/openapi_spec/
```

README.md

Lines changed: 27 additions & 15 deletions

```diff
@@ -8,11 +8,18 @@ HackingBuddyGPT helps security researchers use LLMs to discover new attack vecto

 We aim to become **THE go-to framework for security researchers** and pen-testers interested in using LLMs or LLM-based autonomous agents for security testing. To aid their experiments, we also offer re-usable [linux priv-esc benchmarks](https://github.com/ipa-lab/benchmark-privesc-linux) and publish all our findings as open-access reports.

-How can LLMs aid or even emulate hackers? Threat actors are [already using LLMs](https://arxiv.org/abs/2307.00691), to better protect against this new threat we must learn more about LLMs' capabilities and help blue teams preparing for them.
+If you want to use hackingBuddyGPT and need help selecting the best LLM for your tasks, [we have a paper comparing multiple LLMs](https://arxiv.org/abs/2310.11409).

-**[Join us](https://discord.gg/vr4PhSM8yN) / Help us, more people need to be involved in the future of LLM-assisted pen-testing:**
+## hackingBuddyGPT in the News

-To ground our research in reality, we performed a comprehensive analysis into [understanding hackers' work](https://arxiv.org/abs/2308.07057). There seems to be a mismatch between some academic research and the daily work of penetration testers, please help us to create more visibility for this issue by citing this paper (if suitable and fitting).
+- **upcoming** 2024-11-20: [Manuel Reinsperger](https://www.github.com/neverbolt) will present hackingBuddyGPT at the [European Symposium on Security and Artificial Intelligence (ESSAI)](https://essai-conference.eu/)
+- 2024-07-26: The [GitHub Accelerator Showcase](https://github.blog/open-source/maintainers/github-accelerator-showcase-celebrating-our-second-cohort-and-whats-next/) features hackingBuddyGPT
+- 2024-07-24: [Juergen](https://github.com/citostyle) speaks at [Open Source + mezcal night @ GitHub HQ](https://lu.ma/bx120myg)
+- 2024-05-23: hackingBuddyGPT is part of [GitHub Accelerator 2024](https://github.blog/news-insights/company-news/2024-github-accelerator-meet-the-11-projects-shaping-open-source-ai/)
+- 2023-12-05: [Andreas](https://github.com/andreashappe) presented hackingBuddyGPT at FSE'23 in San Francisco ([paper](https://arxiv.org/abs/2308.00121), [video](https://2023.esec-fse.org/details/fse-2023-ideas--visions-and-reflections/9/Towards-Automated-Software-Security-Testing-Augmenting-Penetration-Testing-through-L))
+- 2023-09-20: [Andreas](https://github.com/andreashappe) presented preliminary results at [FIRST AI Security SIG](https://www.first.org/global/sigs/ai-security/)
+
+## Original Paper

 hackingBuddyGPT is described in [Getting pwn'd by AI: Penetration Testing with Large Language Models ](https://arxiv.org/abs/2308.00121), help us by citing it through:
```

```diff
@@ -29,7 +36,6 @@ hackingBuddyGPT is described in [Getting pwn'd by AI: Penetration Testing with L
 }
 ~~~

-
 ## Getting help

 If you need help or want to chat about using AI for security or education, please join our [discord server where we talk about all things AI + Offensive Security](https://discord.gg/vr4PhSM8yN)!
```
```diff
@@ -74,12 +80,10 @@ The following would create a new (minimal) linux privilege-escalation agent. Thr
 template_dir = pathlib.Path(__file__).parent
 template_next_cmd = Template(filename=str(template_dir / "next_cmd.txt"))

-@use_case("minimal_linux_privesc", "Showcase Minimal Linux Priv-Escalation")
-@dataclass
+
 class MinimalLinuxPrivesc(Agent):

     conn: SSHConnection = None
-
     _sliding_history: SlidingCliHistory = None

     def init(self):
```
```diff
@@ -89,28 +93,33 @@ class MinimalLinuxPrivesc(Agent):
         self.add_capability(SSHTestCredential(conn=self.conn))
         self._template_size = self.llm.count_tokens(template_next_cmd.source)

-    def perform_round(self, turn):
-        got_root : bool = False
+    def perform_round(self, turn: int) -> bool:
+        got_root: bool = False

-        with self.console.status("[bold green]Asking LLM for a new command..."):
+        with self._log.console.status("[bold green]Asking LLM for a new command..."):
             # get as much history as fits into the target context size
             history = self._sliding_history.get_history(self.llm.context_size - llm_util.SAFETY_MARGIN - self._template_size)

             # get the next command from the LLM
             answer = self.llm.get_response(template_next_cmd, capabilities=self.get_capability_block(), history=history, conn=self.conn)
             cmd = llm_util.cmd_output_fixer(answer.result)

-        with self.console.status("[bold green]Executing that command..."):
-            self.console.print(Panel(answer.result, title="[bold cyan]Got command from LLM:"))
-            result, got_root = self.get_capability(cmd.split(" ", 1)[0])(cmd)
+        with self._log.console.status("[bold green]Executing that command..."):
+            self._log.console.print(Panel(answer.result, title="[bold cyan]Got command from LLM:"))
+            result, got_root = self.get_capability(cmd.split(" ", 1)[0])(cmd)

         # log and output the command and its result
-        self.log_db.add_log_query(self._run_id, turn, cmd, result, answer)
+        self._log.log_db.add_log_query(self._log.run_id, turn, cmd, result, answer)
         self._sliding_history.add_command(cmd, result)
-        self.console.print(Panel(result, title=f"[bold cyan]{cmd}"))
+        self._log.console.print(Panel(result, title=f"[bold cyan]{cmd}"))

         # if we got root, we can stop the loop
         return got_root
+
+
+@use_case("Showcase Minimal Linux Priv-Escalation")
+class MinimalLinuxPrivescUseCase(AutonomousAgentUseCase[MinimalLinuxPrivesc]):
+    pass
 ~~~

 The corresponding `next_cmd.txt` template would be:
```
```diff
@@ -170,6 +179,9 @@ wintermute.py: error: the following arguments are required: {linux_privesc,windo

 # start wintermute, i.e., attack the configured virtual machine
 $ python wintermute.py minimal_linux_privesc
+
+# install dependencies for testing if you want to run the tests
+$ pip install .[testing]
 ~~~

 ## Publications about hackingBuddyGPT
```

pyproject.toml

Lines changed: 9 additions & 1 deletion

```diff
@@ -29,11 +29,14 @@ dependencies = [
     'requests == 2.32.0',
     'rich == 13.7.1',
     'tiktoken == 0.6.0',
-    'instructor == 1.2.2',
+    'instructor == 1.3.5',
     'PyYAML == 6.0.1',
     'python-dotenv == 1.0.1',
     'pypsexec == 0.3.0',
+    'pydantic == 2.8.2',
     'openai == 1.28.0',
+    'BeautifulSoup4',
+    'nltk'
 ]

 [project.urls]
```

```diff
@@ -54,6 +57,11 @@ pythonpath = "src"
 addopts = [
     "--import-mode=importlib",
 ]
+[project.optional-dependencies]
+testing = [
+    'pytest',
+    'pytest-mock'
+]

 [project.scripts]
 wintermute = "hackingBuddyGPT.cli.wintermute:main"
```

src/hackingBuddyGPT/capabilities/http_request.py

Lines changed: 10 additions & 1 deletion

```diff
@@ -41,7 +41,16 @@ def __call__(self,
                  ) -> str:
         if body is not None and body_is_base64:
             body = base64.b64decode(body).decode()
-
+        if self.host[-1] != "/":
+            path = "/" + path
+        resp = self._client.request(
+            method,
+            self.host + path,
+            params=query,
+            data=body,
+            headers=headers,
+            allow_redirects=self.follow_redirects,
+        )
         try:
             resp = self._client.request(
                 method,
```
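The slash handling added above can be illustrated in isolation. A minimal sketch (the helper name `join_host_path` is hypothetical, not part of the commit), reproducing the exact guard from the diff, which only prepends a separator when the host lacks a trailing slash:

```python
def join_host_path(host: str, path: str) -> str:
    # Same guard as in the diff: add a separator only when the host does
    # not already end with "/". Note it does not collapse a double slash
    # if the path itself already starts with "/".
    if host[-1] != "/":
        path = "/" + path
    return host + path

print(join_host_path("http://example.com", "robots.txt"))
print(join_host_path("http://example.com/", "robots.txt"))
```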
Lines changed: 78 additions & 0 deletions

```diff
@@ -0,0 +1,78 @@
+import base64
+from dataclasses import dataclass, field
+from typing import Set, Dict, Callable, Literal, Optional
+import inspect
+
+import requests
+from pydantic import create_model, BaseModel
+
+from . import Capability
+
+
+@dataclass
+class SubmitHTTPMethod(Capability):
+    http_methods_format: str
+    valid_http_methods: Set[str]
+    _client = requests.Session()
+    host: str
+    follow_redirects: bool = False
+    success_function: Callable[[], None] = None
+
+    submitted_valid_http_methods: Set[str] = field(default_factory=set, init=False)
+
+    def describe(self) -> str:
+        return f"Submits a HTTP methods, which is the goal of all activities. The HTTP methods format is {self.http_methods_format}. If you find a flag following the that format, that you have not yet submitted, then the most important first step is to submit it, ignoring all other possibilities of further action"
+
+    def to_model(self) -> BaseModel:
+        """
+        Converts the parameters of the `__call__` function of the capability to a pydantic model, that can be used to
+        interface with an LLM using eg instructor or the openAI function calling API.
+        The model will have the same name as the capability class and will have the same fields as the `__call__`,
+        the `__call__` method can then be accessed by calling the `execute` method of the model.
+        """
+        sig = inspect.signature(self.__call__)
+        fields = {param: (param_info.annotation, ...) for param, param_info in sig.parameters.items()}
+        model_type = create_model(self.__class__.__name__, __doc__=self.describe(), **fields)
+
+        def execute(model):
+            m = model.dict()
+            return self(**m)
+
+        model_type.execute = execute
+
+        return model_type
+
+    def __call__(self, method: Literal["GET", "HEAD", "POST", "PUT", "DELETE", "OPTION", "PATCH"],
+                 path: str,
+                 query: Optional[str] = None,
+                 body: Optional[str] = None,
+                 body_is_base64: Optional[bool] = False,
+                 headers: Optional[Dict[str, str]] = None
+                 ) -> str:
+
+        if body is not None and body_is_base64:
+            body = base64.b64decode(body).decode()
+
+        resp = self._client.request(
+            method,
+            self.host + path,
+            params=query,
+            data=body,
+            headers=headers,
+            allow_redirects=self.follow_redirects,
+        )
+        try:
+            resp.raise_for_status()
+        except requests.exceptions.HTTPError as e:
+            return str(e)
+
+        headers = "\r\n".join(f"{k}: {v}" for k, v in resp.headers.items())
+        if len(self.submitted_valid_http_methods) == len(self.valid_http_methods):
+            if self.success_function is not None:
+                self.success_function()
+            else:
+                return "All methods submitted, congratulations"
+        # turn the response into "plain text format" for responding to the prompt
+        return f"HTTP/1.1 {resp.status_code} {resp.reason}\r\n{headers}\r\n\r\n{resp.text}"
```
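The `to_model` method above builds a function-calling schema from `__call__` using `inspect.signature` and `pydantic.create_model`. The extraction step can be sketched with the standard library alone (the helper and example names here are illustrative, not from the commit):

```python
import inspect
from typing import Dict, Optional


def signature_fields(fn) -> Dict[str, object]:
    # Mirror of the first step in to_model(): map each parameter name of
    # a callable to its annotation, ready to be handed to a schema builder
    # such as pydantic.create_model.
    sig = inspect.signature(fn)
    return {name: param.annotation for name, param in sig.parameters.items()}


def example_call(path: str, query: Optional[str] = None) -> str:
    # Stand-in for a capability's __call__ (illustrative only).
    return path


fields = signature_fields(example_call)
print(fields)
```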
Lines changed: 44 additions & 0 deletions

```diff
@@ -0,0 +1,44 @@
+from dataclasses import dataclass, field
+from typing import Tuple, List
+
+import yaml
+
+from . import Capability
+
+@dataclass
+class YAMLFile(Capability):
+
+    def describe(self) -> str:
+        return "Takes a Yaml file and updates it with the given information"
+
+    def __call__(self, yaml_str: str) -> str:
+        """
+        Updates a YAML string based on provided inputs and returns the updated YAML string.
+
+        Args:
+            yaml_str (str): Original YAML content in string form.
+            updates (dict): A dictionary representing the updates to be applied.
+
+        Returns:
+            str: Updated YAML content as a string.
+        """
+        try:
+            # Load the YAML content from string
+            data = yaml.safe_load(yaml_str)
+
+            print(f'Updates:{yaml_str}')
+
+            # Apply updates from the updates dictionary
+            #for key, value in updates.items():
+            #    if key in data:
+            #        data[key] = value
+            #    else:
+            #        print(f"Warning: Key '{key}' not found in the original data. Adding new key.")
+            #        data[key] = value
+            #
+            ## Convert the updated dictionary back into a YAML string
+            #updated_yaml_str = yaml.safe_dump(data, sort_keys=False)
+            #return updated_yaml_str
+        except yaml.YAMLError as e:
+            print(f"Error processing YAML data: {e}")
+        return "None"
```
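The commented-out block in `YAMLFile.__call__` describes a key-by-key merge that is never executed. Its intended logic, extracted as a plain-dict helper independent of YAML parsing (the helper name is hypothetical, not part of the commit):

```python
def apply_updates(data: dict, updates: dict) -> dict:
    # The merge the commented-out block describes: overwrite existing
    # keys, warn about and add keys that are not yet present.
    for key, value in updates.items():
        if key not in data:
            print(f"Warning: Key '{key}' not found in the original data. Adding new key.")
        data[key] = value
    return data


merged = apply_updates({"host": "10.0.0.1"}, {"host": "10.0.0.2", "port": 22})
```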

src/hackingBuddyGPT/cli/wintermute.py

Lines changed: 5 additions & 4 deletions

```diff
@@ -8,12 +8,13 @@ def main():
     parser = argparse.ArgumentParser()
     subparser = parser.add_subparsers(required=True)
     for name, use_case in use_cases.items():
-        use_case.build_parser(subparser.add_parser(
+        subb = subparser.add_parser(
             name=use_case.name,
             help=use_case.description
-        ))
-
-    parsed = parser.parse_args(sys.argv[1:])
+        )
+        use_case.build_parser(subb)
+    x = sys.argv[1:]
+    parsed = parser.parse_args(x)
     instance = parsed.use_case(parsed)
     instance.init()
     instance.run()
```
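The refactoring above unrolls the nested `build_parser(add_parser(...))` call: each use case first gets its own subparser, then installs its arguments on it. The same registration pattern, as a self-contained sketch (the `use_cases` table and flag names here are illustrative, not from the commit):

```python
import argparse

# Illustrative registry: each entry knows its help text and its arguments,
# standing in for a use case's build_parser().
use_cases = {
    "minimal_linux_privesc": {
        "help": "demo privilege-escalation use case",
        "args": [("--max-turns", int, 10)],
    },
}

parser = argparse.ArgumentParser()
subparsers = parser.add_subparsers(dest="use_case", required=True)
for name, spec in use_cases.items():
    # First create the subparser, then let the use case populate it,
    # mirroring the two-step form the commit switches to.
    sub = subparsers.add_parser(name, help=spec["help"])
    for flag, typ, default in spec["args"]:
        sub.add_argument(flag, type=typ, default=default)

parsed = parser.parse_args(["minimal_linux_privesc", "--max-turns", "5"])
```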

src/hackingBuddyGPT/usecases/agents.py

Lines changed: 26 additions & 9 deletions

```diff
@@ -4,18 +4,33 @@
 from rich.panel import Panel
 from typing import Dict

+from hackingBuddyGPT.usecases.base import Logger
 from hackingBuddyGPT.utils import llm_util
-
 from hackingBuddyGPT.capabilities.capability import Capability, capabilities_to_simple_text_handler
-from .common_patterns import RoundBasedUseCase
+from hackingBuddyGPT.utils.openai.openai_llm import OpenAIConnection
+

 @dataclass
-class Agent(RoundBasedUseCase, ABC):
+class Agent(ABC):
     _capabilities: Dict[str, Capability] = field(default_factory=dict)
     _default_capability: Capability = None
+    _log: Logger = None
+
+    llm: OpenAIConnection = None

     def init(self):
-        super().init()
+        pass
+
+    def before_run(self):
+        pass
+
+    def after_run(self):
+        pass
+
+    # callback
+    @abstractmethod
+    def perform_round(self, turn: int) -> bool:
+        pass

     def add_capability(self, cap: Capability, default: bool = False):
         self._capabilities[cap.get_name()] = cap
```

```diff
@@ -29,6 +44,7 @@ def get_capability_block(self) -> str:
         capability_descriptions, _parser = capabilities_to_simple_text_handler(self._capabilities)
         return "You can either\n\n" + "\n".join(f"- {description}" for description in capability_descriptions.values())

+
 @dataclass
 class AgentWorldview(ABC):
```

```diff
@@ -40,6 +56,7 @@ def to_template(self):
     def update(self, capability, cmd, result):
         pass

+
 class TemplatedAgent(Agent):

     _state: AgentWorldview = None
```

```diff
@@ -59,7 +76,7 @@ def set_template(self, template:str):
     def perform_round(self, turn:int) -> bool:
         got_root : bool = False

-        with self.console.status("[bold green]Asking LLM for a new command..."):
+        with self._log.console.status("[bold green]Asking LLM for a new command..."):
             # TODO output/log state
             options = self._state.to_template()
             options.update({
```

```diff
@@ -70,16 +87,16 @@ def perform_round(self, turn:int) -> bool:
             answer = self.llm.get_response(self._template, **options)
             cmd = llm_util.cmd_output_fixer(answer.result)

-        with self.console.status("[bold green]Executing that command..."):
-            self.console.print(Panel(answer.result, title="[bold cyan]Got command from LLM:"))
+        with self._log.console.status("[bold green]Executing that command..."):
+            self._log.console.print(Panel(answer.result, title="[bold cyan]Got command from LLM:"))
             capability = self.get_capability(cmd.split(" ", 1)[0])
             result, got_root = capability(cmd)

         # log and output the command and its result
-        self.log_db.add_log_query(self._run_id, turn, cmd, result, answer)
+        self._log.log_db.add_log_query(self._log.run_id, turn, cmd, result, answer)
         self._state.update(capability, cmd, result)
         # TODO output/log new state
-        self.console.print(Panel(result, title=f"[bold cyan]{cmd}"))
+        self._log.console.print(Panel(result, title=f"[bold cyan]{cmd}"))

         # if we got root, we can stop the loop
         return got_root
```
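The refactored `Agent` no longer inherits from `RoundBasedUseCase`; instead it exposes a lifecycle of `init`, `before_run`, repeated `perform_round(turn)` calls until one returns `True`, then `after_run`. A minimal self-contained sketch of that contract and a driver loop (class and function names here are illustrative, not the project's actual runner):

```python
from abc import ABC, abstractmethod


class MiniAgent(ABC):
    # Sketch of the lifecycle hooks the refactored Agent defines:
    # no-op defaults, with perform_round as the one required callback.
    def init(self):
        pass

    def before_run(self):
        pass

    def after_run(self):
        pass

    @abstractmethod
    def perform_round(self, turn: int) -> bool:
        ...


def run(agent: MiniAgent, max_turns: int = 10) -> int:
    # Drive the lifecycle: perform rounds until one reports success
    # (e.g. got root) or the turn budget is exhausted; return turns used.
    agent.init()
    agent.before_run()
    turns = 0
    for turn in range(1, max_turns + 1):
        turns = turn
        if agent.perform_round(turn):
            break
    agent.after_run()
    return turns


class TwoTurnAgent(MiniAgent):
    def perform_round(self, turn: int) -> bool:
        # Pretend the goal is reached on the second round.
        return turn >= 2
```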
