InspectJS: leveraging code similarity and user-feedback for effective taint specification inference for JavaScript

Static analysis has established itself as a weapon of choice for detecting security vulnerabilities. Taint analysis in particular is a very general and powerful technique, where security policies are expressed in terms of forbidden flows, either from untrusted input sources to sensitive sinks (in in...

Bibliographic Details
Main authors:	Dutta, Saikat; Garbervetsky, Diego; Lahiri, Shuvendu; Schafer, Max
Format:	Conference object; Abstract
Language:	English
Published:	2022
Subjects:	Computer Science; Taint analysis; Machine learning; JavaScript
Online access:	http://sedici.unlp.edu.ar/handle/10915/151643 https://publicaciones.sadio.org.ar/index.php/JAIIO/article/download/305/254
Contributed by:	SEDICI (UNLP), Universidad Nacional de La Plata

id	I19-R120-10915-151643
record_format	dspace
spelling	I19-R120-10915-151643; 2023-05-03T20:04:19Z; http://sedici.unlp.edu.ar/handle/10915/151643; https://publicaciones.sadio.org.ar/index.php/JAIIO/article/download/305/254; issn:2451-7496; InspectJS: leveraging code similarity and user-feedback for effective taint specification inference for JavaScript; Dutta, Saikat; Garbervetsky, Diego; Lahiri, Shuvendu; Schafer, Max; 2022-10; 2022; 2023-04-18T15:27:19Z; en; Computer Science; Taint analysis; Machine learning; JavaScript; Published in: 2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP); Sociedad Argentina de Informática e Investigación Operativa; Conference object; Abstract; http://creativecommons.org/licenses/by-nc-sa/4.0/; Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0); application/pdf; 73-73
institution	Universidad Nacional de La Plata
institution_str	I-19
repository_str	R-120
collection	SEDICI (UNLP)
language	English
topic	Computer Science; Taint analysis; Machine learning; JavaScript
description	Static analysis has established itself as a weapon of choice for detecting security vulnerabilities. Taint analysis in particular is a very general and powerful technique, where security policies are expressed in terms of forbidden flows, either from untrusted input sources to sensitive sinks (in integrity policies) or from sensitive sources to untrusted sinks (in confidentiality policies). The appeal of this approach is that the taint-tracking mechanism has to be implemented only once, and can then be parameterized with different taint specifications (that is, sets of sources and sinks, as well as any sanitizers that render otherwise problematic flows innocuous) to detect many different kinds of vulnerabilities. But while techniques for implementing scalable inter-procedural static taint tracking are fairly well established, crafting taint specifications is still more of an art than a science, and in practice tends to involve a lot of manual effort. Past work has focussed on automated techniques for inferring taint specifications for libraries either from their implementation or from the way they tend to be used in client code. Among the latter, machine-learning-based approaches have shown great promise. In this work we present our experience combining an existing machine-learning approach to mining sink specifications for JavaScript libraries with manual taint modelling in the context of GitHub’s CodeQL analysis framework. We show that the machine-learning component can successfully infer many new taint sinks that either are not part of the manual modelling or are not detected due to analysis incompleteness. Moreover, we present techniques for organizing sink predictions using automated ranking and code-similarity metrics that allow an analysis engineer to efficiently sift through large numbers of predictions to identify true positives.
Published in: 2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP).
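The idea in the abstract that a taint tracker is written once and then parameterized by a taint specification can be illustrated with a minimal sketch. This is not InspectJS or CodeQL code: the `TaintSpec` type, the `find_forbidden_flows` function, and the toy flow graph with its node names are all hypothetical, chosen only to show how the same reachability search can serve both an integrity and a confidentiality policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaintSpec:
    """Hypothetical taint specification: sources, sinks, and sanitizers."""
    sources: frozenset
    sinks: frozenset
    sanitizers: frozenset = frozenset()


def find_forbidden_flows(flow_graph, spec):
    """Return sorted (source, sink) pairs joined by a path that avoids sanitizers.

    flow_graph maps each node to the nodes its value may flow into.
    """
    flows = set()
    for source in spec.sources:
        seen = {source}
        stack = [source]
        while stack:  # depth-first search from each source
            node = stack.pop()
            for succ in flow_graph.get(node, ()):
                # Sanitizers cut the flow: taint never propagates past them.
                if succ in seen or succ in spec.sanitizers:
                    continue
                seen.add(succ)
                if succ in spec.sinks:
                    flows.add((source, succ))
                stack.append(succ)
    return sorted(flows)


# Toy data-flow graph (assumed for illustration only).
FLOW_GRAPH = {
    "req.query": ["name"],
    "name": ["db.query", "escape"],
    "escape": ["res.send"],
    "secret_key": ["log"],
}

# Integrity policy: untrusted input must not reach sensitive sinks.
INTEGRITY = TaintSpec(
    sources=frozenset({"req.query"}),
    sinks=frozenset({"db.query", "res.send"}),
    sanitizers=frozenset({"escape"}),
)

# Confidentiality policy: sensitive data must not reach untrusted sinks.
CONFIDENTIALITY = TaintSpec(
    sources=frozenset({"secret_key"}),
    sinks=frozenset({"log"}),
)

print(find_forbidden_flows(FLOW_GRAPH, INTEGRITY))
print(find_forbidden_flows(FLOW_GRAPH, CONFIDENTIALITY))
```

In this sketch the search itself never changes; only the specification does, which is why inferring good specifications (the problem the abstract addresses) determines what such an analysis can find.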
format	Conference object; Abstract
author	Dutta, Saikat; Garbervetsky, Diego; Lahiri, Shuvendu; Schafer, Max
title	InspectJS: leveraging code similarity and user-feedback for effective taint specification inference for JavaScript
publishDate	2022
url	http://sedici.unlp.edu.ar/handle/10915/151643 https://publicaciones.sadio.org.ar/index.php/JAIIO/article/download/305/254
