Earl.Crawler.Middleware
0.0.0-alpha.0.111
dotnet add package Earl.Crawler.Middleware --version 0.0.0-alpha.0.111
NuGet\Install-Package Earl.Crawler.Middleware -Version 0.0.0-alpha.0.111
<PackageReference Include="Earl.Crawler.Middleware" Version="0.0.0-alpha.0.111" />
paket add Earl.Crawler.Middleware --version 0.0.0-alpha.0.111
#r "nuget: Earl.Crawler.Middleware, 0.0.0-alpha.0.111"
// Install Earl.Crawler.Middleware as a Cake Addin
#addin nuget:?package=Earl.Crawler.Middleware&version=0.0.0-alpha.0.111&prerelease
// Install Earl.Crawler.Middleware as a Cake Tool
#tool nuget:?package=Earl.Crawler.Middleware&version=0.0.0-alpha.0.111&prerelease
Earl Middleware Layer
The "Earl Middleware Layer" refers to a suite of APIs that enable the composition of code into a series of operations performed against a url during a crawl.
Earl's Middleware pattern is strongly influenced by ASP.NET Core's Middleware pattern; reviewing ASP.NET Core's Middleware documentation is strongly recommended.
Middleware accepts a CrawlUrlContext and a CrawlUrlDelegate. The former represents the current state of the crawl against the current url; the latter is a reference to the next operation in the pipeline. This behaviour is captured in the ICrawlerMiddleware contract, which is analogous to ASP.NET Core's IMiddleware contract.
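To make the "next operation in the pipeline" idea concrete, here is a minimal sketch of how a chain of context-plus-next delegates can be composed. It does not use Earl's types and is not Earl's implementation: a plain Uri stands in for the crawl context, Func<Uri, Task> stands in for the next delegate, and the composition loop is an assumption modelled on the general pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Stand-in shapes, NOT Earl's types: the "context" is a plain Uri, the "next"
// delegate is Func<Uri, Task>, and a middleware is Func<Uri, Func<Uri, Task>, Task>.
var log = new List<string>();

var middlewares = new List<Func<Uri, Func<Uri, Task>, Task>>
{
    async ( url, next ) =>
    {
        log.Add( "first: before" );
        await next( url );
        log.Add( "first: after" );
    },
    async ( url, next ) =>
    {
        log.Add( "second: before" );
        await next( url );
        log.Add( "second: after" );
    },
};

// Terminal step that runs once every middleware has called next.
Func<Uri, Task> pipeline = url =>
{
    log.Add( $"end of pipeline for {url}" );
    return Task.CompletedTask;
};

// Compose back to front so the first registered middleware runs first.
for( var i = middlewares.Count - 1; i >= 0; i-- )
{
    var middleware = middlewares[ i ];
    var next = pipeline;
    pipeline = url => middleware( url, next );
}

await pipeline( new Uri( "https://example.com/" ) );
Console.WriteLine( string.Join( Environment.NewLine, log ) );
```

Each middleware runs its "before" work in registration order, and its "after" work in reverse order as the calls to next return.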
Middleware is configured for a crawl using the Use extension methods, which allow three means of implementing middleware:
- Typed Middleware
- Typed Middleware with Options
- Delegate Middleware
Typed Middleware
"Typed Middleware" refers to a class that implements the ICrawlerMiddleware contract, for example:
public class CustomMiddleware : ICrawlerMiddleware
{
    public Task InvokeAsync( CrawlUrlContext context, CrawlUrlDelegate next )
    {
        Console.WriteLine( $"Executing typed middleware while crawling {context.Url}" );
        return next( context );
    }
}

// ...

var options = CrawlerOptionsBuilder.CreateDefault()
    .Use<CustomMiddleware>()
    .Build();

await crawler.CrawlAsync( new Uri(...), options );
Typed Middleware with Options
If you wish to allow consumers of a Middleware to supply an object that configures its functionality, the ICrawlerMiddleware<TOptions> contract may be used.
When using the ICrawlerMiddleware<TOptions> contract, declare a constructor dependency on an instance of TOptions, and invoke the Use<TMiddleware, TOptions>( this ICrawlerOptionsBuilder builder, TOptions options ) extension method to configure the desired TOptions for a crawl:
public record CustomMiddlewareOptions( string Value );

public class CustomMiddleware : ICrawlerMiddleware<CustomMiddlewareOptions>
{
    private readonly CustomMiddlewareOptions options;

    // Accept options as ctor dependency
    public CustomMiddleware( CustomMiddlewareOptions options )
        => this.options = options;

    public Task InvokeAsync( CrawlUrlContext context, CrawlUrlDelegate next )
    {
        Console.WriteLine( $"Executing typed middleware with option '{options.Value}' while crawling {context.Url}" );
        return next( context );
    }
}

// ...

var options = CrawlerOptionsBuilder.CreateDefault()
    .Use<CustomMiddleware, CustomMiddlewareOptions>( new( "Hello, World!" ) )
    .Build();

await crawler.CrawlAsync( new Uri(...), options );
Delegate Middleware
The final method of implementing a Middleware is a "Delegate Middleware", which allows an inline delegate method to be used:
var options = CrawlerOptionsBuilder.CreateDefault()
    .Use(
        ( CrawlUrlContext context, CrawlUrlDelegate next ) =>
        {
            Console.WriteLine( $"Executing delegate middleware while crawling {context.Url}" );
            return next( context );
        }
    )
    .Build();

await crawler.CrawlAsync( new Uri(...), options );
Delegate Middleware is especially useful for debugging & testing other Middleware in the crawl.
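As an illustration of that debugging use, the sketch below places an inline probe ahead of a middleware under test and records how long the downstream step took and whether it threw. It uses stand-in shapes rather than Earl's types: a plain Uri for the context and Func<Uri, Task> for the next delegate. The failing "/broken" middleware and all names here are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

// Stand-in shapes, NOT Earl's types: Uri is the "context", Func<Uri, Task> is "next".
var observations = new List<string>();

// Hypothetical middleware under test: fails for one particular path.
Func<Uri, Func<Uri, Task>, Task> underTest = ( url, next ) =>
    url.AbsolutePath == "/broken"
        ? Task.FromException( new InvalidOperationException( "cannot crawl" ) )
        : next( url );

// Inline probe: times the downstream step and records any failure it observes.
Func<Uri, Func<Uri, Task>, Task> probe = async ( url, next ) =>
{
    var stopwatch = Stopwatch.StartNew();
    try
    {
        await next( url );
        observations.Add( $"ok {url.AbsolutePath} after {stopwatch.ElapsedMilliseconds}ms" );
    }
    catch( InvalidOperationException exception )
    {
        observations.Add( $"failed {url.AbsolutePath}: {exception.Message}" );
    }
};

// Wire the probe in front of the middleware under test.
Func<Uri, Task> end = _ => Task.CompletedTask;
Func<Uri, Task> pipeline = url => probe( url, inner => underTest( inner, end ) );

await pipeline( new Uri( "https://example.com/fine" ) );
await pipeline( new Uri( "https://example.com/broken" ) );
Console.WriteLine( string.Join( Environment.NewLine, observations ) );
```

Because the probe is an inline delegate, it can be dropped in or removed without writing a dedicated class.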
Product | Versions (compatible and additional computed target framework versions) |
---|---|
.NET | net6.0 is compatible. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. |
- net6.0
  - Earl.Crawler.Abstractions (>= 0.0.0-alpha.0.111)
  - Earl.Crawler.Events (>= 0.0.0-alpha.0.111)
  - Earl.Crawler.Middleware.Abstractions (>= 0.0.0-alpha.0.111)
  - Microsoft.Extensions.DependencyInjection (>= 6.0.0)
NuGet packages (2)
Showing the top 2 NuGet packages that depend on Earl.Crawler.Middleware:
Package | Downloads |
---|---|
Earl.Crawler.Middleware.UrlScraping: Earl Middleware for scraping and enqueuing urls via the Earl.Crawler.Middleware.Html.Abstractions.IHtmlDocumentFeature when crawling a url. | |
Earl.Crawler: Earl is a suite of APIs for developing url crawlers & web scrapers driven by a middleware pattern similar to, and strongly influenced by, ASP.NET Core. | |
GitHub repositories
This package is not used by any popular GitHub repositories.