Commit 9fefccb — Batch lookup

kazeevn committed Jul 9, 2024 · 1 parent 721d4f5
Showing 4 changed files with 85 additions and 0 deletions.
18 changes: 18 additions & 0 deletions README.md
@@ -26,6 +26,18 @@ LANGCHAIN_PROJECT="llm-osint"

This tool is spooky good at gathering information from publicly available sources. However, it is crucial to recognize the responsibility that comes with using such a powerful tool. When utilizing it to research individuals other than yourself, always be cognizant of each person's right to privacy. Remember that personal information uncovered through open-source intelligence remains personal and should be treated with respect and protection. Use this tool ethically and responsibly, ensuring that you do not infringe upon anyone's privacy or engage in malicious activities.

### Demo vs full mode
By default, the code is configured to do a fast surface search. For a full deep dive, edit `config.yaml` and set
```yaml
deep_dive_topics: 10
deep_dive_rounds: 2
```
You may also want to use [a later model](https://platform.openai.com/docs/models) instead of GPT-3.5. Be aware that this will increase the cost of the operation roughly 10x. For example:
```yaml
llm:
model_name: gpt-4-turbo
```
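The `llm` block of `config.yaml` is passed straight through as keyword arguments to the model constructor (`ChatOpenAI(**config.llm)` in `batch_lookup.py`). A minimal stdlib-only sketch of that pattern, with a stand-in constructor and an illustrative `temperature` key that is not necessarily in the real config:

```python
# Sketch: how a config section becomes constructor kwargs.
# The dict stands in for the parsed config.yaml; "temperature"
# is a hypothetical key used only for illustration.
config = {"llm": {"model_name": "gpt-4-turbo", "temperature": 0.0}}


def make_model(**kwargs):
    # Stand-in for ChatOpenAI(**config.llm): just records the kwargs it got.
    return dict(kwargs)


model = make_model(**config["llm"])
print(model["model_name"])  # → gpt-4-turbo
```

Because the block is unpacked wholesale, any extra keys you add under `llm:` travel to the constructor unchanged.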
### Person Lookup
The most obvious use for something like this is to have it "google" someone and then perform an action with this information. In these examples, the author used it to research himself and took the first result. **No other additional information was given to the script beyond the command below**. For common names, you can disambiguate with a qualifier such as `John Smith (the Texas Musician)`.
@@ -266,6 +278,12 @@ Happy coding (and chewing)! 😃
</details>
### Batch lookup
Put the list of names in a file, one per line, e.g. `example_names.txt`, and run `batch_lookup.py`. For example:
```bash
python batch_lookup.py example_names.txt --ask "List 5 main associates" --n-jobs 4
```
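Internally, `batch_lookup.py` fans the names out over a `multiprocessing.Pool` with `starmap`. A self-contained sketch of the same pattern, with a toy worker in place of the real LLM lookup:

```python
from multiprocessing import Pool


def lookup(name, question):
    # Toy stand-in for the real look_person_up worker:
    # no network, no LLM call, just echoes what would be asked.
    return f"{name}: {question}"


if __name__ == "__main__":
    names = ["Vladimir Lenin", "Henry Ford"]
    question = "List 5 main associates"
    # starmap unpacks each (name, question) tuple into lookup's arguments.
    with Pool(2) as pool:
        results = pool.starmap(lookup, [(n, question) for n in names])
    print(results)
```

`--n-jobs` controls the pool size; lookups are independent, so scaling it up mainly trades API rate limits against wall-clock time.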

## Prompt Architecture

### Design
64 changes: 64 additions & 0 deletions batch_lookup.py
@@ -0,0 +1,64 @@
import argparse
import re
from pathlib import Path
from multiprocessing import Pool
from omegaconf import OmegaConf
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage
from person_lookup import fetch_internet_content, ASK_PROMPT


def look_person_up(
        name: str,
        question: str,
        output_folder: Path,
        config: OmegaConf) -> None:
    """
    Look up a person and save the results to a file.

    Args:
        name: The name of the person
        question: The question to ask about the person
        output_folder: The folder to save the results to
        config: The LLM OSINT configuration
    """
    try:
        content = fetch_internet_content(name, config)
        file_name = re.sub(r"[^\w]", "", name).lower() + ".txt"
        content_dir = Path("internet_content")
        content_dir.mkdir(exist_ok=True)  # the raw-dump directory may not exist yet
        with open(content_dir / file_name, "wt", encoding="utf-8") as f:
            f.write(content)

        model = ChatOpenAI(**config.llm)
        result = model.invoke([HumanMessage(
            ASK_PROMPT.format(
                name=name, internet_content=content, question=question))]).content
        with open(output_folder / file_name, "wt", encoding="utf-8") as f:
            f.write(result)
        print(f"Finished looking up {name}. Results:\n{result}")
        print(f"Results saved to {output_folder / file_name}")
    except Exception as e:
        print(f"Error looking up {name}: {e}")


def main():
    parser = argparse.ArgumentParser(description="Look up multiple people")
    parser.add_argument("names_file", type=Path,
                        help="A file with names of people to look up, one per line")
    parser.add_argument("--ask", type=str, required=True,
                        help="The question to ask about each person")
    parser.add_argument("--output-folder", type=Path, default="batch_results",
                        help="The folder to save the results to")
    parser.add_argument("--n-jobs", type=int, default=2,
                        help="The number of jobs to run in parallel")
    args = parser.parse_args()
    args.output_folder.mkdir(exist_ok=True)
    with open(args.names_file, "rt", encoding="utf-8") as f:
        names = f.read().splitlines()
    config = OmegaConf.load(Path(__file__).parent / "config.yaml")
    print(f"Looking up {len(names)} people...")
    with Pool(args.n_jobs) as pool:
        pool.starmap(
            look_person_up,
            [(name, args.ask, args.output_folder, config) for name in names])


if __name__ == "__main__":
    main()
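A note on output filenames: `look_person_up` derives them from the person's name by stripping every non-word character and lowercasing, so distinct names that differ only in punctuation or spacing map to the same file. A standalone sketch of that mapping:

```python
import re


def name_to_filename(name: str) -> str:
    # Same sanitization as in look_person_up: keep only letters,
    # digits, and underscores, lowercase the rest, append ".txt".
    return re.sub(r"[^\w]", "", name).lower() + ".txt"


print(name_to_filename("Vladimir Lenin"))  # → vladimirlenin.txt
```

Since `\w` is Unicode-aware in Python 3, accented letters survive; only spaces and punctuation are dropped.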
1 change: 1 addition & 0 deletions batch_results/.gitignore
@@ -0,0 +1 @@
*.txt
2 changes: 2 additions & 0 deletions example_names.txt
@@ -0,0 +1,2 @@
Vladimir Lenin
Henry Ford