Analysis of Conficker data for March to May, 2009

  1. Background

    The Conficker worm appeared in November 2008, spread rapidly, and has been through a series of changes since then. The initial version, Conficker A, appeared on 21 Nov 08, infecting hosts by exploiting the MS08-067 vulnerability in Microsoft Windows. Conficker B appeared about 29 Dec 08 and introduced additional techniques for spreading. Both Conficker A and B used an algorithm to generate pseudo-random domain names, and could download new code from a web server when they found one at such a domain name.

    CAIDA has observed traffic from Conficker-infected hosts using the [UCSD network telescope], and documented Conficker A and B's behaviour in [CONFICKER-AB].

    Conficker has been well documented by SRI in their technical report [SRI-CONFICKER]. Conficker C introduced a new mechanism for downloading new code, using infected hosts to form a peer-to-peer (p2p) network. It also introduced a mechanism that updates the infected host's Conficker version, and prevents any earlier versions from running. SRI observed [SRI-CONFICKER-C] that Conficker C activity began on 5 Mar 09 (UTC), and increased significantly on 17 Mar 09 (UTC).

    There is some confusion over the names of the Conficker variants; Wikipedia [WIKI-CONFICKER] lists A through E. In this article we use SRI's naming scheme; the essential difference is that where SRI refer to B++ and C, Wikipedia refer to C and D. Both use the same names for A, B and E.

    The SRI report documents all the Conficker features, and provides an algorithm that computes port numbers based on the packet's destination IP address and Unix epoch week. A packet whose destination port matches one of the computed port numbers is highly likely to be from a Conficker C host. Using that algorithm, we have investigated the behaviour of Conficker C packets from mid-March to early May of 2009.
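
    To illustrate how such a test could be structured, here is a minimal Python sketch. It is not SRI's algorithm: `placeholder_ports` is a hypothetical stand-in that merely takes the same inputs (destination address and epoch week), and all function names are assumptions made for illustration. The real port computation is the one given in the SRI report.

```python
import ipaddress
from typing import Callable, Set

SECONDS_PER_WEEK = 7 * 24 * 60 * 60


def epoch_week(timestamp: float) -> int:
    """Whole weeks elapsed since the Unix epoch (1 Jan 1970 UTC)."""
    return int(timestamp) // SECONDS_PER_WEEK


def placeholder_ports(dst_ip: int, week: int) -> Set[int]:
    """HYPOTHETICAL stand-in for the port generator; NOT the SRI algorithm.

    It only mimics the interface: (destination address, epoch week) -> ports.
    """
    mixed = (dst_ip * 2654435761 + week) & 0xFFFFFFFF
    return {1024 + (mixed % 64512), 1024 + ((mixed >> 16) % 64512)}


def looks_like_p2p(
    dst_ip: str,
    dst_port: int,
    timestamp: float,
    port_fn: Callable[[int, int], Set[int]] = placeholder_ports,
) -> bool:
    """True if dst_port is one of the ports port_fn derives for this packet."""
    ip_as_int = int(ipaddress.IPv4Address(dst_ip))
    return dst_port in port_fn(ip_as_int, epoch_week(timestamp))
```

    Passing the port generator as `port_fn` keeps the classification loop independent of the generator, so a faithful implementation of the published algorithm could be substituted without touching the rest of the analysis.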

  2. Hourly telescope data volumes

    Figure 1: Conficker C p2p packets and Telescope trace file sizes each hour


  3. Trace breakdown by port number, Mar 09 compared to Nov 08

    Our initial efforts to look at port numbers simply counted the number of packets seen for every possible port. That produced large PostScript files and images in which patterns were hard to see. Instead, we aggregate the port numbers into five ranges as follows:

    Well-known ports (wkp)   0 - 444, 446 - 1023
    MS08-067 (445)           445 [MS08-067]
    Windows XP (xp)          1024 - 4999 [WINDOWS-PORTS]
    Applications (apps)      5000 - 49151
    Ephemeral (eph)          49152 - 65535 [EPHEMERAL-PORTS]
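
    The grouping can be expressed as a small lookup function. The Python sketch below is an illustration only; the function name `port_group` is an assumption, and the short labels are the column headings used in the tables that follow. Port 445 is treated as its own group and every other port below 1024 as well-known.

```python
def port_group(port: int) -> str:
    """Map a TCP or UDP port number to one of the five port groups."""
    if not 0 <= port <= 65535:
        raise ValueError(f"not a valid port number: {port}")
    if port == 445:
        return "445"   # MS08-067 target port, counted separately
    if port <= 1023:
        return "wkp"   # well-known ports
    if port <= 4999:
        return "xp"    # Windows XP dynamic port range
    if port <= 49151:
        return "apps"  # application ports
    return "eph"       # ephemeral ports
```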

    To test SRI's algorithm for recognizing Conficker C p2p packets, here are the port breakdowns for the first hour of three days.

    In these tables, the Total column gives the number of packets seen in an hour (0000-0100); the columns to its right show the percentage of those packets in each port range, for UDP and for TCP.

    21 Nov 08         ------------- UDP % -------------  ------------- TCP % -------------
               Total    wkp    445     xp   apps    eph    wkp    445     xp   apps    eph
    Source       18M   1.76   0.00   9.58   5.19   1.21  17.72   0.00  51.38   8.36   4.81
    Dst other    18M   3.52   0.00   1.93   8.41   3.86   6.91  47.82   3.39  18.12   6.04
    Conf C p2p   668  55.84   0.00   0.00   0.15   0.30  32.93   0.00   0.00   4.79   5.99

    18 Mar 09         ------------- UDP % -------------  ------------- TCP % -------------
               Total    wkp    445     xp   apps    eph    wkp    445     xp   apps    eph
    Source       61M   1.96   0.00  12.73   7.76   3.08   9.85   0.00  32.66  26.71   5.23
    Dst other    47M   3.01   0.00   1.52   9.33   1.75  12.92  29.08   4.87  27.79   9.74
    Conf C p2p   14M   0.00   0.00   1.14  39.97  16.67   0.00   0.00   0.84  29.18  12.19

    25 Apr 09         ------------- UDP % -------------  ------------- TCP % -------------
               Total    wkp    445     xp   apps    eph    wkp    445     xp   apps    eph
    Source       82M   0.83   0.00  11.30   5.07   1.29  18.21   0.00  52.32   8.30   2.69
    Dst other    76M   1.46   0.00   6.55   5.93   1.40   2.95  54.32   7.62  14.73   5.03
    Conf C p2p    7M   0.04   0.00   1.07  37.08  15.49   0.02   0.00   0.91  32.03  13.37
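
    One row of such a table could be computed along the following lines. This Python sketch is illustrative rather than the code used for this study: the name `breakdown` and the input format (one `(protocol, port group)` pair per packet) are assumptions. Each percentage is taken over all TCP and UDP packets in the row, so the ten columns sum to 100.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

PROTOCOLS = ("UDP", "TCP")
GROUPS = ("wkp", "445", "xp", "apps", "eph")


def breakdown(
    packets: Iterable[Tuple[str, str]],
) -> Tuple[int, Dict[Tuple[str, str], float]]:
    """Return (total packets, percentage per (protocol, port group) cell).

    Packets whose protocol or group is not a table column are ignored,
    so the percentages of a non-empty row always sum to 100.
    """
    counts = Counter(
        p for p in packets if p[0] in PROTOCOLS and p[1] in GROUPS
    )
    total = sum(counts.values())
    percentages = {
        (proto, group): (100.0 * counts[(proto, group)] / total if total else 0.0)
        for proto in PROTOCOLS
        for group in GROUPS
    }
    return total, percentages
```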

    From 21 Nov 08 (early Conficker A) to 18 Mar 09 (after Conficker C)

    From 18 Mar 09 (after Conficker C) to 25 Apr 09 (after Conficker E)


  4. When did we begin to see Conficker C p2p packets?

    In the section above, we noted that on 21 Nov 08 (i.e. long before Conficker C appeared) a few packets (about 0.005%) were falsely identified as Conficker C p2p. However, by 18 Mar 09 the number of p2p packets seen in an hour had risen to 14 million. To determine when Conficker C actually began to send p2p packets, we investigated the proportions of p2p vs 'other' packets, and of UDP vs TCP packets.

    Figure 2: The rise of Conficker C, showing proportions of p2p/other packets for TCP/UDP

    Figure 2 uses stacked bars to show the proportions of p2p vs 'other' packets, and also of TCP vs UDP packets. From 21 Dec 08 through 16 Jan 09 we were not able to record telescope data, as indicated by the orange arrow. The day SRI noted an upsurge in Conficker C activity, 17 Mar 09, is indicated by the black arrow (upper right).

    The plot shows that on and after 8 Mar 09 a significant fraction of the packets reaching the telescope were Conficker C p2p. Although we saw a tiny fraction of such packets before then, that 'false positive' fraction was insignificant. Interestingly, the fractions of UDP and TCP were similar throughout November through April.
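
    The kind of check described above, finding the first day on which the p2p fraction rises clearly above the false-positive level, can be sketched in Python as follows. The function names, the threshold and the input layout are assumptions made for illustration, and any counts passed in an example should be understood as synthetic rather than telescope measurements.

```python
from typing import Dict, Optional, Tuple


def p2p_fraction(p2p: int, other: int) -> float:
    """Fraction of packets classified as p2p (0.0 if there were no packets)."""
    total = p2p + other
    return p2p / total if total else 0.0


def first_significant_day(
    daily: Dict[str, Tuple[int, int]], threshold: float
) -> Optional[str]:
    """Return the earliest day whose p2p fraction exceeds the threshold.

    daily maps an ISO date string (YYYY-MM-DD) to (p2p count, other count);
    ISO dates sort chronologically when sorted as strings.
    """
    for day in sorted(daily):
        if p2p_fraction(*daily[day]) > threshold:
            return day
    return None
```

    The threshold would need to sit well above the false-positive fraction seen before the worm was active; the value to use is a judgment call and is not taken from this study.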


  5. Number of Unique p2p hosts seen during 17 Mar 09

    17 Mar 09 was the day when we saw the greatest number of unique Conficker C hosts sending p2p packets. Figure 3 (below) was generated by a C program that reads through all the trace files for a day, finds the p2p packets, and builds up a hash table of their source addresses.
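
    The counting step can be sketched in a few lines of Python (the program used for this study was written in C and is not reproduced here). The sketch assumes that each infected host is identified by the source address of its packets, and that an earlier step has already flagged each packet as p2p or not; the name `unique_p2p_sources` and the input layout are hypothetical.

```python
from typing import Iterable, Set, Tuple


def unique_p2p_sources(packets: Iterable[Tuple[str, bool]]) -> Set[str]:
    """Collect the distinct source addresses of packets flagged as p2p.

    packets yields (source address, is_p2p) pairs. A Python set is a hash
    table, so each address is stored once however many packets it sends.
    """
    seen: Set[str] = set()
    for src_ip, is_p2p in packets:
        if is_p2p:
            seen.add(src_ip)
    return seen
```

    Because the input is consumed one packet at a time, memory use grows with the number of distinct hosts rather than with the size of the trace files.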

    Figure 3: Unique Conficker C hosts appearing on 17 Mar 09

    In [CODE-RED] Moore, Shannon and Brown observed that some IP prefixes appeared to have high numbers of Code-Red-infected hosts. They suggested that this could be due to "DHCP inflation," i.e. a number of infected hosts might log out and return later, with DHCP giving them a different IP address when they logged in again, so that a telescope would overestimate the number of infected hosts. They also pointed out that NATs and proxy gateways could hide a population of hosts behind a single IP address, so that a telescope would underestimate the actual number of infected hosts.

    We have not attempted to quantify either of these two effects in this study.


  6. Slide Show: Spread of Conficker C


    Figure 4: Global spread of Conficker C during 16-18 Mar 09

    This slide show was developed in JavaScript, and is based on a webmonkey JavaScript Slideshow tutorial, modified and extended to use the Unobtrusive Slider Control V2 from frequency decoder. Thanks to Sebastian Castro for the 'world map' images, and to Brad Huffaker for help in making the slide show.

  7. IP Port Observations for packets reaching the telescope

    The following plots use the port groupings described in section 3 above.

    Beginning of Conficker C activity

    Figure 5: Destination port packet rates, early March 09

    March 17 was the day that Conficker C started spreading; these two plots show the start of that activity.
    For Destination ports:

    Figure 6: Source port packet rates, early March 09

    For Source ports:

    Beginning of Conficker E activity

    Figure 7: Destination port packet rates, early April 09

    April 9 was when Conficker E started [CONFICKER-E]. Conficker C doesn't send MS08-067 (port 445) packets; it only sends p2p packets. Conficker E is supposed to have started to send MS08-067 packets, as a means of infecting new hosts.
    For Destination ports:

    Figure 8: Source port packet rates, early April 09

    For Source ports: Overall, the only noticeable change is a possible rise in the volume of port 445 traffic to XP systems.

    A four-day period late in April (for comparison)

    Figure 9: Destination port packet rates, late April 09

    On 14 April we removed our rate limiting of common attack ports, particularly of port 445. These two plots provide a view of the total traffic reaching the telescope.
    For Destination ports:

    Figure 10: Source port packet rates, late April 09

    For Source ports: Overall, it's clear that packets to port 445 make up a significant fraction of all the packets we see arriving at the telescope.

  8. Packet sizes for TCP and UDP Conficker C p2p packets

    We investigated whether the packet size distributions for these two protocols had changed over time. If they had, that might provide a way to recognise the various Conficker events. Figures 11-13 (below) compare the telescope packet size distributions for TCP and UDP on four dates:

    Figure 11: TCP packet size distributions, Mar - Apr 09

    Figure 12: UDP packet size distributions, Mar - Apr 09

    Figure 13: UDP packet size distributions as lin-lin plots

    This plot shows the same data as above, viewed as a set of four linear-linear plots, one for each day. Viewed in this way, there are clear differences between the distributions.
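
    A per-protocol packet size distribution of the kind plotted above could be computed as in the Python sketch below. It is an illustration under stated assumptions, not the code used for the figures: the name `size_distribution` and the input layout (one `(protocol, size in bytes)` pair per packet) are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def size_distribution(
    packets: Iterable[Tuple[str, int]],
) -> Dict[str, Dict[int, float]]:
    """Return, for each protocol, the fraction of packets of each size."""
    counts: Dict[str, Counter] = {}
    for proto, size in packets:
        counts.setdefault(proto, Counter())[size] += 1

    result: Dict[str, Dict[int, float]] = {}
    for proto, size_counts in counts.items():
        total = sum(size_counts.values())
        # Sort by size so the result can be plotted directly.
        result[proto] = {
            size: n / total for size, n in sorted(size_counts.items())
        }
    return result
```

    Computing one such distribution per day, and plotting the results on shared axes, gives a direct way to compare days.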


  9. Conclusion

    This investigation has provided a little more information about Conficker C and its peer-to-peer packet behaviour. I believe that the use of both UDP and TCP is a common characteristic of peer-to-peer networks, so it's not surprising that Conficker C uses both. Further investigation of Conficker C's UDP packet usage would clearly be worthwhile.

    Working with data from the telescope has been challenging because it is very different from 'normal' network trace data. Since the telescope is completely passive, it is purely one-directional (in to the telescope), so 'traffic flow' analysis methods cannot be used with it. Also, the data volumes are high - between 2 and 8 GB in a one-hour trace file - so analysis code must be carefully constructed to produce results fast enough to be useful.

    Lastly, this work hinges on our ability to identify the Conficker C p2p packets. Again, we acknowledge the work of SRI on Conficker, as published in their Technical Report [SRI-CONFICKER].


  10. References

    [CODE-RED]
    "Code-Red: a case study on the spread and victims of an Internet worm,"
    David Moore, Colleen Shannon, Jeffery Brown, IMW, 2002
    [CONFICKER-AB]
    "Conficker/Conflicker/Downadup as seen from the UCSD Network Telescope,"
    Emile Aben, February 2009
    [CONFICKER-E]
    "Conficker.E," Microsoft Security Response Centre, 9 Apr 09
    [EPHEMERAL-PORTS]
    "The Ephemeral Port Range," Mike Gleason, October 2004
    [HONEYPOT]
    "Know Your Enemy: Containing Conficker,"
    Felix Leder, Tillmann Werner, 30 Mar 09 (rev 1)
    [INITIAL-TTL]
    "Initial TTL Values," January 2009
    [MS08-067]
    "MS08-067 Vulnerability: Botnets Reloaded," J.M. Hipolito, 25 Nov 08
    [SRI-CONFICKER]
    "An Analysis of Conficker's Logic and Rendezvous Points,"
    Phillip Porras, Hassen Saidi and Vinod Yegneswaran, SRI International, 19 Mar 09
    [SRI-CONFICKER-C]
    "Addendum: Conficker C Analysis,"
    Phillip Porras, Hassen Saidi and Vinod Yegneswaran, SRI International, 4 Apr 09
    [UCSD network telescope]
    "Network Telescope Research," CAIDA
    [WIKI-CONFICKER]
    "Conficker," Wikipedia, 6 May 09
    [WINDOWS-PORTS]
    "The default dynamic port range for TCP/IP has changed in Windows Vista
    and in Windows Server 2008," January 2008

  11. Acknowledgments

    This work builds on that carried out by Emile Aben [CONFICKER-AB] in February and March 2009. My thanks to Emile, Stefan Savage, kc claffy, Brandon Enright, Brian Kantor, Brad Huffaker, Sebastian Castro and Ryan Koga for their insightful discussions. Conficker has greatly increased the amount of data flowing from the network telescope. Managing that data has required significant amounts of technical work by Dan Anderson, Josh Polterock and Brian Kantor; I'm very grateful for all their help.


Last updated: 27 Apr 09 Nevil Brownlee