AI Security | AI Security Applications: Malicious Code Homology Analysis Based on Code Semantics

Published: 2021-11-01

1. Introduction

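In our previous article [1], we focused on a homology analysis method for malicious code based on image classification. That method essentially classifies samples by features of the malicious code's byte-level content. However, from a reverse-engineering perspective, this approach is not interpretable.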

As is well known, assembly code has fairly distinct, readable syntax. If we first disassemble the malicious code, then use natural language processing (NLP) techniques to extract semantic features of the code, and only then perform homology analysis, the resulting method is easy to interpret. This is the code-semantics-based homology analysis method introduced in this article. At present, this approach is used not only in malicious code detection but also in code clone search, code infringement determination, and other fields.

This article first introduces the background knowledge for code-semantics-based homology analysis, then reviews related work, and finally presents a technical design for code-semantics-based homology analysis and verifies its effectiveness through experiments.

2. Background

The foundation of code-semantics-based malicious code homology analysis is semantic extraction. PV-DM and TextCNN are two common NLP models that are used for code semantic extraction. They are described below:

(1) Distributed Memory Model of Paragraph Vectors (PV-DM)

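In the PV-DM model, word vectors and the paragraph vector are concatenated and used to predict the next word in the text. As the window slides over the sentence, the paragraph vector comes to memorize the context of all words in that sentence. Because the paragraph vector has a fixed dimension regardless of how long the text is, using PV-DM for code semantic extraction is a simple and effective way to solve the problem of inconsistent vector lengths (Figure 1).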

Figure 1. The PV-DM model
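To make the sliding window and the concatenation concrete, here is a minimal sketch in plain Python. It is not the article's implementation: the function names, the toy opcode sequences, the window size, and the 4-dimensional random vectors are all assumptions for illustration, and no training step is shown.

```python
import random

def pvdm_training_examples(doc_id, tokens, window=2):
    """Slide a window over the tokens and collect
    (doc_id, context_tokens, target_token) triples."""
    examples = []
    for i in range(window, len(tokens)):
        context = tuple(tokens[i - window:i])
        examples.append((doc_id, context, tokens[i]))
    return examples

def pvdm_input(doc_vecs, word_vecs, doc_id, context):
    """Concatenate the paragraph vector with the context word vectors.
    This is the input from which PV-DM predicts the next word."""
    features = list(doc_vecs[doc_id])
    for tok in context:
        features.extend(word_vecs[tok])
    return features

random.seed(0)
DIM = 4  # hypothetical vector size, chosen only for the example

# Two hypothetical samples with different numbers of instructions.
docs = {
    "sample_a": ["push", "mov", "call", "ret"],
    "sample_b": ["push", "mov", "xor", "mov", "call", "pop", "ret"],
}
vocab = sorted({t for toks in docs.values() for t in toks})
word_vecs = {t: [random.uniform(-1, 1) for _ in range(DIM)] for t in vocab}
doc_vecs = {d: [random.uniform(-1, 1) for _ in range(DIM)] for d in docs}

examples_a = pvdm_training_examples("sample_a", docs["sample_a"])
examples_b = pvdm_training_examples("sample_b", docs["sample_b"])
x = pvdm_input(doc_vecs, word_vecs, *examples_a[0][:2])
```

The longer sample yields more training examples, but both samples end up represented by a paragraph vector of the same dimension, which is the property the text relies on.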

(2) The TextCNN model

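TextCNN converts text into a matrix by stacking word vectors, and then applies a convolutional neural network to take advantage of deep learning. Compared with an ordinary convolutional neural network, TextCNN uses multiple convolution kernels of different sizes in its convolution layer (Figure 2). TextCNN has a simple network structure, trains quickly, and performs fairly well. However, its embedding layer uses a pretrained word vector model (such as Word2Vec) for semantic extraction, so samples with different numbers of words produce inputs of different lengths.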

Figure 2. The TextCNN model
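The core of the TextCNN convolution layer can be sketched without any deep learning framework. The example below is an illustration only: the function names, the tiny 2-dimensional "word vectors", the hand-written kernels, and the use of ReLU are assumptions, and a real model would learn the kernel weights and feed the pooled features to a classifier.

```python
def conv1d_max(matrix, kernel):
    """Slide a kernel (k rows x dim) down the text matrix (n rows x dim),
    apply ReLU to each response, and keep the maximum (max-over-time pooling).
    Assumes the text has at least as many rows as the kernel."""
    k = len(kernel)
    responses = []
    for start in range(len(matrix) - k + 1):
        total = 0.0
        for r in range(k):
            for a, b in zip(matrix[start + r], kernel[r]):
                total += a * b
        responses.append(max(total, 0.0))
    return max(responses)

def textcnn_features(matrix, kernels):
    """One pooled feature per kernel; kernels may span different numbers of words."""
    return [conv1d_max(matrix, kernel) for kernel in kernels]

# Four tokens, each with a hypothetical 2-dimensional word vector.
text_matrix = [[1, 0], [0, 1], [1, 1], [0, 0]]
kernels = [
    [[1, 0], [0, 1]],          # spans 2 tokens
    [[1, 1], [1, 1], [1, 1]],  # spans 3 tokens
]
features = textcnn_features(text_matrix, kernels)
print(features)  # [2.0, 4.0]
```

Each kernel contributes one value after pooling, so the feature vector length depends on the number of kernels rather than on the number of tokens.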

3. Related Work

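Zhang et al. [2] addressed the family classification of ransomware and proposed a feature extraction method. The method converts each sample's instruction sequence into sets of n-grams for different values of n, computes the TF-IDF (term frequency-inverse document frequency) of each n-gram, and selects the t n-grams with the highest TF-IDF values in each family as features. However, n-gram features reflect only sequential characteristics and cannot capture the semantic information of the code text.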
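The n-gram and TF-IDF steps described above can be sketched as follows. This is a simplified illustration, not the procedure from [2]: the exact TF-IDF weighting, the way scores are aggregated per family (summed here), and the toy opcode sequences are all assumptions.

```python
import math
from collections import Counter

def ngrams(opcodes, n):
    """Return all contiguous n-grams of an opcode sequence as tuples."""
    return [tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]

def tfidf_per_sample(samples, n):
    """Compute TF-IDF for every n-gram of every sample.
    TF is the relative frequency inside a sample; IDF is log(N / df)."""
    counts = [Counter(ngrams(s, n)) for s in samples]
    doc_freq = Counter(g for c in counts for g in c)
    total_docs = len(samples)
    scores = []
    for c in counts:
        length = sum(c.values())
        scores.append({
            g: (freq / length) * math.log(total_docs / doc_freq[g])
            for g, freq in c.items()
        })
    return scores

def top_t_for_family(scores, member_indices, t):
    """Sum the scores over a family's samples and keep the t highest n-grams."""
    total = Counter()
    for i in member_indices:
        total.update(scores[i])
    return [g for g, _ in total.most_common(t)]

# Hypothetical opcode sequences; samples 0 and 1 belong to one family.
samples = [
    ["push", "mov", "call"],
    ["push", "mov", "ret"],
    ["xor", "xor", "ret"],
]
scores = tfidf_per_sample(samples, n=2)
family_features = top_t_for_family(scores, [0, 1], t=3)
```

In this toy data the bigram ("push", "mov") occurs in two of the three samples, so its IDF is lower and it ranks below the bigrams that are unique to one sample.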

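Chen et al. proposed a code-semantics-based method for determining malicious code homology [3], which uses Word2Vec to obtain word vectors for instructions and TextCNN for classification. Fang et al. used the FastText model to extract word vectors from JavaScript code [4]. FastText takes multiple words and their n-grams as input and directly outputs the class predicted by the model.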

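Ding et al. proposed Asm2Vec [5], a semantic model for assembly code that is used to extract the semantic information of instruction code. The method is designed on top of the paragraph-vector distributed memory model PV-DM and takes the format of assembly code into account. Because a control flow graph can reflect, to some extent, the dynamic ordering information of code, some studies first build the control flow graph of the code and then assess code similarity with techniques such as graph matching and graph neural networks (GNN). Although GNNs perform better than traditional graph matching, they still fall short in semantic learning. To address this, Yu et al. proposed a method that captures the semantics, structure, and order of code at the same time [6]: it trains a BERT model on prediction tasks to obtain semantic information, uses a message passing neural network (MPNN) to obtain structural information, and uses a ResNet model to extract order information.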

4. Solution Design

The code-semantics-based homology analysis solution consists of two main parts: semantic feature extraction and homology classification training. The processing pipeline consists of the following steps (Figure 3):

Step 1: Data preparation. Collect samples and label their classes to build the training dataset;

Step 2: Disassembly. Disassemble the malicious Portable Executable (PE) files to obtain assembly code;

Step 3: Preprocessing. Use NLP techniques to preprocess the assembly code, including tokenization and keyword filtering;

Step 4: Semantic extraction. Build a semantic model, train it on the training data, and extract the semantic features of each sample. This article uses PV-DM and the Word2Vec embedding in TextCNN as the semantic extraction models.

Step 5: Homology classification. Based on the semantic features, analyze homology with a similarity metric or with clustering/classification algorithms. This article uses DNN, KMeans clustering, CNN, and related techniques.

Figure 3. Code-semantics-based homology analysis pipeline
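The preprocessing step (Step 3) is described only briefly, so the following is one possible sketch of it rather than the article's actual procedure. The comment syntax (`;`), the `IMM` placeholder for numeric literals, and the choice to keep only opcodes as "keywords" are assumptions made for illustration.

```python
import re

# Numeric literals: 0x-prefixed hex, digit-led hex with an "h" suffix, or decimal.
# Requiring a leading digit keeps registers such as "ah" or "bh" intact.
NUMERIC = re.compile(r"^(0x[0-9a-f]+|[0-9][0-9a-f]*h|\d+)$")

def tokenize_instruction(line):
    """Lowercase one disassembly line, drop any ';' comment, split it into
    tokens, and replace numeric literals with the placeholder 'IMM'."""
    line = line.split(";", 1)[0].strip().lower()
    if not line:
        return []
    parts = re.split(r"[\s,]+", line)
    return ["IMM" if NUMERIC.match(p) else p for p in parts]

def opcode_sequence(lines):
    """Keyword filtering in its simplest form: keep only the opcode of each
    non-empty instruction line."""
    tokens_per_line = (tokenize_instruction(line) for line in lines)
    return [tokens[0] for tokens in tokens_per_line if tokens]

# Hypothetical disassembly listing.
listing = [
    "push ebp",
    "mov ebp, esp",
    "; function prologue ends here",
    "MOV eax, 0x10 ; load constant",
    "mov ah, 1",
    "call 401000h",
]
tokens = [tokenize_instruction(line) for line in listing]
opcodes = opcode_sequence(listing)
```

Replacing addresses and constants with a placeholder keeps the vocabulary small, which matters when the token sequences are later fed to PV-DM or Word2Vec.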

5. Experimental Analysis

This section experimentally validates two homology analysis methods based on code semantic models. The samples used in the experiments come from the Internet and cover the classes Application, Backdoor, Generic, Trojan, Variant, Virus, and Worm (Table 1).

Table 1. Experimental dataset

Experiment 1: Homology analysis based on the PV-DM model

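Figure 4 shows the training process for the PV-DM semantic model. We extracted 256-dimensional semantic vectors and classified them with a neural network, splitting the data into a training set and a test set at a ratio of 4:1. The overall accuracy was 0.74. In addition, we clustered the extracted semantic features with the KMeans algorithm, and the test accuracy was likewise 0.74.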

Figure 4. Training and testing of the PV-DM-based DNN model

Figure 5. PV-DM-based KMeans clustering (Accuracy = 0.74)
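The text does not say how the accuracy of the KMeans clustering was computed. One common convention, assumed here purely for illustration, is to map each cluster to the majority class among its members and count the members that match. The sketch below also shows a 4:1 split; the function names, the seed, and the toy labels are hypothetical.

```python
import random
from collections import Counter, defaultdict

def split_4_to_1(n_items, seed=0):
    """Shuffle the indices 0..n_items-1 and split them 4:1 into train and test."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    cut = n_items * 4 // 5
    return indices[:cut], indices[cut:]

def cluster_accuracy(true_labels, cluster_ids):
    """Map every cluster to its majority class and return the fraction of
    samples whose class equals the majority class of their cluster."""
    by_cluster = defaultdict(Counter)
    for label, cid in zip(true_labels, cluster_ids):
        by_cluster[cid][label] += 1
    correct = sum(c.most_common(1)[0][1] for c in by_cluster.values())
    return correct / len(true_labels)

train_idx, test_idx = split_4_to_1(10)

# Hypothetical labels and cluster assignments for six samples.
labels = ["Trojan", "Trojan", "Worm", "Worm", "Worm", "Virus"]
clusters = [0, 0, 0, 1, 1, 1]
accuracy = cluster_accuracy(labels, clusters)  # 4 of 6 samples match
```

Under this convention a clustering result can be compared with a classifier's accuracy on the same labeled samples, although the two numbers are not strictly equivalent.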

Experiment 2: Homology analysis based on TextCNN

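Figure 6 shows the distribution of instruction counts across the samples: the average is 28, the minimum is 1 (195 samples), and the maximum is 74 (1 sample). We built a TextCNN model with one-dimensional convolution kernels of different sizes, max-pooled the resulting feature maps, and concatenated them. The dataset was split into a training set and a validation set at a ratio of 4:1. As shown in Figure 7, the test accuracy is about 0.65.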

Figure 6. Instruction count statistics

Figure 7. TextCNN training and testing
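Since instruction counts range from 1 to 74, the token sequences have to be brought to a common length before they can be stacked into a batch for a convolutional model. The article does not describe how this was done; padding and truncation, sketched below, is a common approach and is shown here as an assumption. The `<pad>` token and the function names are hypothetical.

```python
def pad_or_truncate(tokens, length, pad_token="<pad>"):
    """Cut the sequence to at most `length` tokens, then pad it on the right
    so that the result always has exactly `length` tokens."""
    clipped = list(tokens[:length])
    return clipped + [pad_token] * (length - len(clipped))

def to_fixed_length_batch(sequences, length):
    """Bring every sequence in the batch to the same length."""
    return [pad_or_truncate(seq, length) for seq in sequences]

MAX_LEN = 74  # the maximum instruction count reported in the text

batch = to_fixed_length_batch(
    [["ret"], ["push", "mov", "call", "ret"]],
    MAX_LEN,
)
```

With an average of 28 instructions per sample, padding to 74 means most rows are largely padding, which is one plausible reason to prefer a shorter cutoff in practice.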

6. Conclusion

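The experiments in this article show that code-semantics-based malicious code homology analysis is feasible to a certain degree. However, when PV-DM and TextCNN are applied directly to extract the semantics of assembly code, they treat assembly code entirely as plain text, and the accuracy of semantic extraction is somewhat low. The method in [5] is a semantic extraction method designed specifically for assembly code and can extract semantic information more accurately. Our follow-up research will build on this method.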

References

[1] Intelligent Security Research Group. AI Security | AI Security Applications | Homology Analysis Based on Image Classification. 2021-10-15.

[2] Hanqi Zhang, Xi Xiao. Classification of ransomware families with machine learning based on N-gram of opcodes [J]. Future Generation Computer Systems, 2019(90): 211-221.

[3] Chen Hanbo, Wu Yue, Zou Futai. An Asm2Vec-based method for determining malicious code homology [J]. Communications Technology, 2019, 52(12): 3010-3015.

[4] Yong Fang, Cheng Huang. Detecting malicious JavaScript code based on semantic analysis [J]. Computers & Security, 2020(93): 1-9.

[5] Steven H. H. Ding, Benjamin C. M. Fung. Asm2Vec: Boosting Static Representation Robustness for Binary Clone Search against Code Obfuscation and Compiler Optimization [C]. S&P, 2019: 1-18.

[6] Zeping Yu, Rui Cao, Qiyi Tang, et al. Order Matters: Semantic-Aware Neural Networks for Binary Code Similarity Detection [C]. AAAI, 2020: 1-8.
