-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathTODO
133 lines (125 loc) · 6.51 KB
/
TODO
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
- Add an ORCONN_BW event to Tor to emit read/write info and also queue sizes
- See tordiffs/orconn-bw.diff but it probably should be a separate event,
not hacked onto ORCONN
- Use nodemon.py to rank nodes based on total bytes, queue sizes, and the
ratio of these two
- Does it agree with results from metatroller's bandwidth stats?
- More NodeRestrictions/PathRestrictions in TorCtl/PathSupport.py
- BwWeightedGenerator
- NodeRestrictions:
- Uptime/LongLivedPorts (Does/should hibernation count?)
- Published/Updated
- GeoIP (http://www.maxmind.com/app/python)
- NodeCountry
- PathRestrictions:
- Family
- GeoIP (http://www.maxmind.com/app/python)
- OceanPhobicRestrictor (avoids Pacific Ocean or two atlantic crossings)
or ContinentRestrictor (avoids doing more than N continent crossings)
- OceanPhilicRestrictor or ContinentJumperRestrictor
- Can be used as counterpoint to see how bad for performance it is
- EchelonPhobicRestrictor
- Does not cross international boundaries for client->Entry or
Exit->destination hops
- Perform statistical analysis on paths
- How often does Tor choose foolish paths normally?
- (4 atlantic/pacific crossings)
- Use speedracer to determine how much slower these paths relly are
- What is the distribution for Pr(ClientLocation|MiddleNode,ExitNode)
and Pr(EntryNode|MiddleNode,ExitNode) for these various path choices?
- Mathematical analysis probably required because this is a large joint
distribution (not GSoC)
- Empirical observation possible if you limit to the top 10% of the
nodes (which carry something like 90% of bandwidth anyways).
- Make few million paths without actually building real
circuits and tally them up in a 3D table
- See PathSupport.py unit tests for some examples on this
- See also:
http://swiki.cc.gatech.edu:8080/ugResearch/uploads/7/ImprovingTor.pdf
- You can also perform predecessor observation of this strategy
empirically. But it is likely the GeoIP stuff is easier to implement
and just as effective.
- Create a PathWatcher that StatsHandler can extend from so people can gather
stats from regular Tor usage
- Use GeoIP to make a map of tor servers color coded by their reliability
- Or augment an existing Tor map project with this data
- Add circuit prebuilding and port history learning for keeping an optimal
pool of circuits available for use
- Build circuits in parallel to speed up scanning
- Rewrite soat.pl in python
- Improve SSL cert handling/verification. openssl client is broken.
- The way we store certs is lame. No need to store so many copies
for diff IPs if they are all the same.
- Also verify STARTTLS is not molested on smtp, pop and imap ports
- Means need to make sure openssl lib supports STARTTLS
- Report failing nodes via SETCONF AuthDirBadExit
to potentially alternate control port than used by metatroller
- dynamic content scanning
- tag structure fingerprinting
- Optionally use same origin policy for dynamic content checks
- Anything in same origin should not change?
- filter out dynamic tags with multiple fetches outside of Tor?
- Or just target specific tags and verify their content
doesn't change
- css, script, and object tags and tags that can contain script
(there are a LOT of these, but we'd only need to check
their attributes)
- Perhaps "double check" to see if a document has changed
outside of tor after a failure through tor
- GeoIP-based exit node grouping to reduce geo-location false positives?
- make sure all http headers match a real browser
- DNS rebind attack scan
- http://christ1an.blogspot.com/2007/07/dns-pinning-explained.html
- Basically we want to make sure that no exit nodes resolve arbitrary
domains to internal IP addresses
- http://www.faqs.org/rfcs/rfc1918.html
- This could be done with periodic calls to
"getinfo address-mappings/cache" during scanning, or by
changing metatroller to inspect STREAM NEWRESOLVE/REMAP events
- Improve checking of changes to documents outside of Tor
- Make a multilingual keyword list of commonly censored terms to google for
using this scanner
- Check Exit policy for sketchyness. Mark BadExit if they allow:
- pop but not pops
- imap not but imaps
- telnet but not ssh
- smtp but not smtps
- http but not https
- This also means we have to verify encrypted ports actually work and
all exits will honor connections through them (in addition to
checkign certs)
- Support multiple scanners in metatroller
- Improve interaction between soat+metatroller so soat knows
which exit was responsible for a given ip/url
- SYN+Reverse DNS resolve scan
- This can detect exit sniffers that reverse resolve IPs. However,
it is high-effort (requires someone to run reverse DNS for us),
and requires keeping their IP range secret.
- Design Reputation System
- Emit some kind of penalty multiplier based on circuit/stream failure rate
and the ratio of directory "observed" bandwidth vs avg stream bandwidth
- Add keyword to directory for clients to use instead of observed
bandwidth for routing decisions
- Make sure scanners don't listen to this keyword to avoid "Creeping
Death"
- Queue lengths from the node monitor can also figure into this penalty
multiplier
- Figure out interface to report this and also BadExit determinations
- Probably involves voting among many scanners
- Justify this is worthwhile, sane, and at least as resistant as the current
Tor network to attack
- Does a reputation system make it easier for an adversary w/ X% of the
network to influence it?
- Preliminary: http://archives.seul.org/or/dev/Nov-2006/msg00004.html
- Sybil attacks
- What about clients that ignore the reputations? Can their behavior game
the system, or are they just behaving suboptimally?
- First impressions: meh; suboptimal
- Does changings in ratings leak any information about clients?
- Does it influence their paths in predictable ways in a greater degree
than bandwidth ranking already does?
- What about detecting the scan and giving better service? Time of day,
source IP, exit IP?
- Stopgap for bootstrapping
- push traffic through the 0.1.1.x with 0 dirport and earlier servers
that claim less than 20KB traffic