TikaOnDotNet 1.17.1

dotnet add package TikaOnDotNet --version 1.17.1
NuGet\Install-Package TikaOnDotNet -Version 1.17.1
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.
<PackageReference Include="TikaOnDotNet" Version="1.17.1" />
For projects that support PackageReference, copy this XML node into the project file to reference the package.
paket add TikaOnDotNet --version 1.17.1
#r "nuget: TikaOnDotNet, 1.17.1"
The #r directive can be used in F# Interactive and Polyglot Notebooks. Copy it into the interactive tool or the script's source code to reference the package.
// Install TikaOnDotNet as a Cake Addin
#addin nuget:?package=TikaOnDotNet&version=1.17.1

// Install TikaOnDotNet as a Cake Tool
#tool nuget:?package=TikaOnDotNet&version=1.17.1

Bare-bones IKVM Java-to-.NET port of Apache Tika. You'll want to install TikaOnDotNet.TextExtractor.

Compatible target frameworks: .NET Framework (net).
    • IKVM (>= 8.1.5717)

NuGet packages (4)

Showing the top 4 NuGet packages that depend on TikaOnDotNet:

Package Downloads
TikaOnDotnet.TextExtractor

Classes for running Apache Tika through **TikaOnDotNet**. Just use TextExtractor.Extract() and you'll be on your way.

DevelopmentHelpers.FileContentReader

This package combines many open-source packages behind one interface for reading many types of content files. For example, it uses Open XML to read .docx files.

Skybrud.Umbraco.Search.DocumentIndexer

This package makes it possible to index and search a wide variety of filetypes in Umbraco, including .pdf and .docx

Jetsons.JetPack.Text

The wrapper library that provides smart extension methods to convert document formats to high quality text.

GitHub repositories (1)

Showing the top 1 popular GitHub repositories that depend on TikaOnDotNet:

Repository Stars
vivami/SauronEye
Search tool to find specific files containing specific words, i.e. files containing passwords.
Version Downloads Last updated
1.17.1 452,949 4/3/2018
1.17.0 37,604 2/15/2018
1.16.0 167,198 7/30/2017
1.15.0 13,663 7/30/2017
1.14.2 113,971 4/22/2017
1.14.2-pre 3,641 4/15/2017
1.14.1 323,269 1/13/2017
1.14.0 9,261 12/8/2016
1.13.1 11,807 8/16/2016
1.13.0 8,126 6/30/2016
1.12.2 40,264 4/12/2016
1.12.1 7,272 4/12/2016
1.12.0 8,404 4/11/2016
1.7.0 18,565 2/6/2015
1.6.4.51427 8,001 1/16/2015
1.6.3 8,615 9/27/2014
1.6.2.1 6,064 6/5/2014
1.6.0 3,955 6/5/2014
1.5.2 3,718 5/30/2014
1.5.0 4,409 3/5/2014
1.4.0.51459 5,096 7/12/2013

- Added new overloads of `TextExtractor.Extract` that let users supply their own extraction result assemblers. Example:
```cs
// Custom result type that keeps every value of each metadata field.
public class CustomResult
{
    public string Text { get; set; }
    public IDictionary<string, string[]> Metadata { get; set; }
}

// Assembler passed to Extract: builds a CustomResult from the extracted text and Tika metadata.
public static CustomResult CreateCustomResult(string text, Metadata metadata)
{
    var metaDataDictionary = metadata.names().ToDictionary(name => name, metadata.getValues);
    return new CustomResult
    {
        Metadata = metaDataDictionary,
        Text = text,
    };
}

[Test]
public void should_extract_author_list_from_pdf()
{
    var textExtractionResult = new TextExtractor().Extract("file_with_authors.pdf", CreateCustomResult);
    textExtractionResult.Metadata["meta:author"].Should().ContainInOrder("Fred Jones, M. D.", "Donald Evans D. M.");
}
```