{"id":6,"date":"2016-02-29T11:33:14","date_gmt":"2016-02-29T11:33:14","guid":{"rendered":"http:\/\/wp.cs.ucl.ac.uk\/semamatch\/?page_id=6"},"modified":"2016-02-29T11:51:05","modified_gmt":"2016-02-29T11:51:05","slug":"about","status":"publish","type":"page","link":"https:\/\/wp.cs.ucl.ac.uk\/semamatch\/about\/","title":{"rendered":"SeMaMatch"},"content":{"rendered":"<p style=\"text-align: justify\">The flood of malware samples is predicted to grow into a deluge in\u00a02012, making the problem of maintaining a database of malware\u00a0signatures ever more difficult. For each new sample, it is important\u00a0to determine the threat that it poses.\u00a0In response to this, dynamic malware analysis\u00a0tools have been designed that execute the sample in a sandbox,\u00a0monitoring the actions of the sample. If these actions are similar\u00a0to those of malware that has already been indexed in the database,\u00a0then one might draw conclusions regarding provenance and severity\u00a0of the threat posed. If the sample does not match against known\u00a0malware, then it can be subject to manual scrutiny, using a disassembler\u00a0such as IDA Pro.<\/p>\n<p style=\"text-align: justify\">This Linnaean approach to malware analysis is both natural and\u00a0convenient: it is natural to group malware into families that share\u00a0common attributes; and it provides a convenient way of assessing\u00a0threat. Yet the whole methodology is predicated on the accuracy\u00a0with which samples are characterised by their signatures. If a\u00a0sample is assigned a signature that does not express its behaviour,\u00a0then samples that are behaviourally distinct can be erroneously\u00a0grouped together. Conversely, samples which behave the same, but\u00a0appear different, can be accidentally placed in different groups.\u00a0The main problem with dynamic malware analysis tools is that they\u00a0execute the binary for a limited time, typically considering just\u00a0one path through the binary. 
This limits the actions that can be\u00a0observed, rendering the signature inaccurate for programs that\u00a0reveal their true behaviour later. In addition, the dynamic approach\u00a0can miss infrequent actions or logic bombs. The dynamic approach is\u00a0also susceptible to timing attacks, in which the sample detects a tracer and suppresses\u00a0some action. Above all, the signatures are based solely\u00a0on those actions that are encountered during the trace.\u00a0More static approaches have been applied too, at one extreme using\u00a0the call graph of the binary itself for classification, and at the\u00a0other deploying model checking techniques to search the paths through\u00a0the call graph for signature behaviours that characterise known malware\u00a0families. Yet graph matching techniques are sensitive to control-flow\u00a0obfuscation and model checking requires the signature behaviours\u00a0to be known up-front and distilled into a temporal formula or an\u00a0automaton.<\/p>\n<p style=\"text-align: justify\">A middle ground is offered by abstract interpretation since it\u00a0provides a way to systematically consider all paths, while monitoring\u00a0a program for actions that inform the construction of the signature.\u00a0Abstract interpretation provides a way to break the dichotomy between\u00a0the purely dynamic and the purely static approach to malware analysis\u00a0into a graduated continuum. Formally, a purely static approach\u00a0(a.k.a. a static analysis) can be derived from the purely dynamic\u00a0approach (a.k.a. a tracer) by composing a sequence of abstractions:\u00a0if all n abstractions are applied the result is the static analysis;\u00a0but if only the first m &lt; n abstractions are applied the result is a\u00a0hybrid. The challenge is to find the hybrid that provides sufficient\u00a0path coverage to uncover logic bombs yet is sufficiently robust\u00a0to be used by practitioners in the security sector. 
The proposed\u00a0project will discover this sweet spot by following two complementary\u00a0lines of inquiry. Concrete traces will be abstracted to cover more\u00a0paths and more actions (at UCL). Static analyses, which cover all\u00a0paths, will be refined to avoid paths and actions that do not\u00a0actually occur (at Kent). Thus UCL will add missing information\u00a0to signatures (converging on the ideal signature from below) whilst\u00a0Kent will remove excess information from signatures (converging on\u00a0the ideal signature from above). By reflecting on the relative\u00a0merits of these approaches, we will draw conclusions on how to\u00a0construct robust signatures for malware classification and thereby\u00a0advance the whole field.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The flood of malware samples is predicted to grow into a deluge in\u00a02012, making the problem of maintaining a database of malware\u00a0signatures ever more difficult. For each new sample, it is important\u00a0to determine the threat that it poses.\u00a0In response to this, dynamic malware analysis\u00a0tools have been designed that execute the sample in a sandbox,\u00a0monitoring the &hellip; <a href=\"https:\/\/wp.cs.ucl.ac.uk\/semamatch\/about\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">SeMaMatch<\/span> <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":93,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":[],"_links":{"self":[{"href":"https:\/\/wp.cs.ucl.ac.uk\/semamatch\/wp-json\/wp\/v2\/pages\/6"}],"collection":[{"href":"https:\/\/wp.cs.ucl.ac.uk\/semamatch\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/wp.cs.ucl.ac.uk\/semamatch\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/wp.cs.ucl.ac.uk\/semamatch\/wp-json\/wp\/v2\/users\/93"}],"replies":[{"embeddable":true,"href":"https:\/\/wp.cs.ucl.ac.uk\/semamatch\/wp-json\/wp\/v2\/comments?post=6"}],"version-history":[{"count":0,"href":"https:\/\/wp.cs.ucl.ac.uk\/semamatch\/wp-json\/wp\/v2\/pages\/6\/revisions"}],"wp:attachment":[{"href":"https:\/\/wp.cs.ucl.ac.uk\/semamatch\/wp-json\/wp\/v2\/media?parent=6"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}