{"id":15288,"date":"2025-12-08T20:01:12","date_gmt":"2025-12-08T20:01:12","guid":{"rendered":"https:\/\/newestek.com\/?p=15288"},"modified":"2025-12-08T20:01:12","modified_gmt":"2025-12-08T20:01:12","slug":"apache-tika-hit-by-critical-vulnerability-thought-to-be-patched-months-ago","status":"publish","type":"post","link":"https:\/\/newestek.com\/?p=15288","title":{"rendered":"Apache Tika hit by critical vulnerability thought to be patched months ago"},"content":{"rendered":"<div>\n<div id=\"remove_no_follow\">\n<div class=\"grid grid--cols-10@md grid--cols-8@lg article-column\">\n<div class=\"col-12 col-10@md col-6@lg col-start-3@lg\">\n<div class=\"article-column__content\">\n<section class=\"wp-block-bigbite-multi-title\">\n<div class=\"container\"><\/div>\n<\/section>\n<p>A security flaw in the widely used Apache Tika document extraction utility, originally made public last summer, is wider in scope and more serious than first thought, the project\u2019s maintainers have warned.<\/p>\n<p>Their <a href=\"https:\/\/lists.apache.org\/thread\/s5x3k93nhbkqzztp1olxotoyjpdlps9k\">new alert<\/a> relates to two entwined flaws: the first, <a href=\"https:\/\/www.cve.org\/CVERecord?id=CVE-2025-54988\">CVE-2025-54988<\/a> from August, rated 8.4 in severity, and the second, <a href=\"https:\/\/nvd.nist.gov\/vuln\/detail\/CVE-2025-66516\">CVE-2025-66516<\/a>, made public last week and rated 10.<\/p>\n<p>CVE-2025-54988 is a weakness in the tika-parser-pdf-module used to process PDFs in Apache Tika from version 1.13 up to and including version 3.2.1. It is one module in <a href=\"https:\/\/tika.apache.org\/\">Tika\u2019s wider ecosystem<\/a> that is used to normalize data from 1,000 proprietary formats so that software tools can index and read them.<\/p>\n<p>Unfortunately, that same document processing capability makes the software a prime target for campaigns using XML External Entity (XXE) injection attacks, a recurring issue in this class of utility.<\/p>\n<p>In the case 
of CVE-2025-54988, this could have allowed an attacker to execute an XML External Entity (XXE) injection attack by hiding XML Forms Architecture (XFA) instructions inside a malicious PDF.<\/p>\n<p>Through this, \u201can attacker may be able to read sensitive data or trigger malicious requests to internal resources or third-party servers,\u201d the CVE record states. Attackers could exploit the flaw to retrieve data from the tool\u2019s document processing pipeline, exfiltrating it via Tika\u2019s processing of the malicious PDF.<\/p>\n<h2 class=\"wp-block-heading\" id=\"cve-superset\">CVE superset<\/h2>\n<p>The maintainers have now realized that the XXE injection flaw is not limited to this module. It affects additional Tika components, namely tika-core versions 1.13 to 3.2.1 and the legacy tika-parsers module, versions 1.13 to 1.28.5.<\/p>\n<p>Unusually \u2013 and confusingly \u2013 this means there are now two CVEs for the same issue, with the second, CVE-2025-66516, a superset of the first. Presumably, the second CVE was issued to draw attention to the fact that people who patched CVE-2025-54988 are still at risk because of the additional vulnerable components listed in CVE-2025-66516.<\/p>\n<p>So far, there\u2019s no evidence that the XXE injection weakness in these CVEs is being exploited by attackers in the wild. However, that could quickly change should the vulnerability be reverse engineered or proofs-of-concept appear.<\/p>\n<p>CVE-2025-66516 carries the maximum severity rating of 10.0, which is unusual and makes patching it a priority for anyone using this software in their environment. Users should update to tika-core version 3.2.2, tika-parser-pdf-module version 3.2.2 (standalone PDF module), or tika-parsers version 2.0.0 if on legacy.<\/p>\n<p>However, patching will only help developers looking after applications known to be using Apache Tika. 
The danger is that Tika might not be listed in every application\u2019s configuration files, creating a blind spot in which its presence goes undetected. The only mitigation in that case would be for developers to <a href=\"https:\/\/tika.apache.org\/3.2.3\/configuring.html\">turn off the XML parsing capability<\/a> in their applications via the tika-config.xml configuration file.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>A security flaw in the widely used Apache Tika document extraction utility, originally made public last summer, is wider in scope and more serious than first thought, the project\u2019s maintainers have warned. Their new alert relates to two entwined flaws: the first, CVE-2025-54988 from August, rated 8.4 in severity, and the second, CVE-2025-66516, made public last week and rated 10. CVE-2025-54988 is a weakness in the&#8230; <\/p>\n<p class=\"more\"><a class=\"more-link\" href=\"https:\/\/newestek.com\/?p=15288\">Read More<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-15288","post","type-post","status-publish","format-standard","hentry","category-uncategorized","is-cat-link-borders-light 
is-cat-link-rounded"],"_links":{"self":[{"href":"https:\/\/newestek.com\/index.php?rest_route=\/wp\/v2\/posts\/15288","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/newestek.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/newestek.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/newestek.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/newestek.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=15288"}],"version-history":[{"count":0,"href":"https:\/\/newestek.com\/index.php?rest_route=\/wp\/v2\/posts\/15288\/revisions"}],"wp:attachment":[{"href":"https:\/\/newestek.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=15288"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/newestek.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=15288"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/newestek.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=15288"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}