Generally speaking, tablebases (and other engines) do not filter out illegal positions. Instead they filter out ill-formed positions, typically:
- Wrong number of kings for a side
- Pawns on first or last rank
- Player to move already delivering check
It's unclear from an orthodox perspective how to play forward from an ill-formed position, so it's right that such positions are excluded. The criteria are easy to apply.
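To show how cheap these criteria are, here is a minimal sketch of the three checks in plain Python. The names (`parse_board`, `attacked`, `is_well_formed`) are my own illustrations, not any engine's API, and the input is assumed to be a FEN string whose first two fields are the piece placement and the side to move.

```python
# Sketch only: all names here are hypothetical, not taken from any real engine.

def parse_board(placement):
    """Map (file, rank), both 0..7, to a piece letter from a FEN placement field."""
    board = {}
    for row, text in enumerate(placement.split("/")):
        rank, file = 7 - row, 0
        for ch in text:
            if ch.isdigit():
                file += int(ch)          # a digit is a run of empty squares
            else:
                board[(file, rank)] = ch
                file += 1
    return board

KNIGHT = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]
KING = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
ROOK = [(1, 0), (-1, 0), (0, 1), (0, -1)]
BISHOP = [(1, 1), (1, -1), (-1, 1), (-1, -1)]

def attacked(board, square, by_white):
    """True if `square` is attacked by some piece of the given colour."""
    x, y = square

    def own(piece, kinds):
        return piece is not None and piece.isupper() == by_white and piece.upper() in kinds

    # A white pawn attacks diagonally upward, so it sits one rank below its target.
    pawn_dy = -1 if by_white else 1
    if any(own(board.get((x + dx, y + pawn_dy)), "P") for dx in (-1, 1)):
        return True
    if any(own(board.get((x + dx, y + dy)), "N") for dx, dy in KNIGHT):
        return True
    if any(own(board.get((x + dx, y + dy)), "K") for dx, dy in KING):
        return True
    # Sliding pieces: walk each ray until the first occupied square.
    for directions, kinds in ((ROOK, "RQ"), (BISHOP, "BQ")):
        for dx, dy in directions:
            cx, cy = x + dx, y + dy
            while 0 <= cx < 8 and 0 <= cy < 8:
                piece = board.get((cx, cy))
                if piece is not None:
                    if own(piece, kinds):
                        return True
                    break
                cx, cy = cx + dx, cy + dy
    return False

def is_well_formed(fen):
    placement, side = fen.split()[:2]
    board = parse_board(placement)
    pieces = list(board.values())
    # Exactly one king per side.
    if pieces.count("K") != 1 or pieces.count("k") != 1:
        return False
    # No pawns on the first or last rank.
    if any(p in "Pp" and rank in (0, 7) for (_, rank), p in board.items()):
        return False
    # The side to move must not already be giving check.
    white_to_move = side == "w"
    waiting_king = "k" if white_to_move else "K"
    king_square = next(sq for sq, p in board.items() if p == waiting_king)
    return not attacked(board, king_square, by_white=white_to_move)
```

Note that a diagram with nine White pawns, such as `4k3/8/8/8/8/P7/PPPPPPPP/4K3 w - - 0 1`, passes this test: it is illegal but well-formed.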
In some engines, the stated reason for excluding an ill-formed position is that it is “illegal”, but that is misleading. Many illegal positions are handled perfectly well by the engine. If an ill-formed position were not illegal, then players would have stumbled across it, and the rules would have been fixed to allow us to play forwards from it. It is exactly the ill-formedness that means the position cannot be covered by the engine.
Another source of confusion is that tablebases do use what they call "retrograde analysis" to iterate backwards from checkmate positions, to find out how they can be forced. But that is completely different from retrograde analysis (RA) in the problemist's sense: determining whether an arbitrary position has an ancestor in the game array and hence is legal.
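For contrast, the tablebase sense of "retrograde analysis" can be sketched on a toy game graph. This is only an illustration under my own assumptions: `retrograde_solve`, `moves` and `mated` are hypothetical names, every successor is assumed to appear as a key of `moves` with no duplicates, and a real generator works on indexed chess positions with un-move generation rather than an explicit dictionary.

```python
from collections import deque

def retrograde_solve(moves, mated):
    """Label positions by working backwards from checkmates.

    moves: dict mapping each position to its list of successor positions.
    mated: positions in which the side to move is checkmated.
    Returns a dict position -> ("win" or "loss", plies) for the side to move.
    Positions missing from the result cannot be forced either way (draws).
    """
    # Build the predecessor lists once, so we can walk the graph backwards.
    preds = {p: [] for p in moves}
    for p, succs in moves.items():
        for s in succs:
            preds[s].append(p)

    result = {p: ("loss", 0) for p in mated}
    unresolved = {p: len(succs) for p, succs in moves.items()}
    queue = deque(mated)
    while queue:
        pos = queue.popleft()
        outcome, plies = result[pos]
        for prev in preds[pos]:
            if prev in result:
                continue
            if outcome == "loss":
                # prev can move into a position lost for the opponent.
                result[prev] = ("win", plies + 1)
                queue.append(prev)
            else:
                # This move from prev hands the opponent a win; prev is lost
                # only once every one of its moves has been shown to do so.
                unresolved[prev] -= 1
                if unresolved[prev] == 0:
                    result[prev] = ("loss", plies + 1)
                    queue.append(prev)
    return result
```

Nothing in this procedure asks whether a position could have arisen from the game array. It only asks what can be forced going forward.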
Reasons for not excluding illegal positions from engine consideration include:
- The FIDE Laws of Chess apply to illegal positions too. Although the notion of "illegal position" is defined in the Laws, it is never used (except, I think, in a single weird corner case in Rapid Chess). It's just defined so that arbiters can refer to it.
- It can be really hard to determine whether a position is legal. So much so that an entire genre of problem composition is built around it.
- Removing illegal positions doesn't make it easier to write the engine, whether it be a tablebase like Syzygy or a more general engine like Stockfish.
- There are interesting positions which happen to be illegal.
- Chess problems, according to the Codex, can still be sound if the position is illegal.
- When composing a problem, it's sometimes a helpful approach to find an illegal diagram version first, and later to refine the diagram so that it's legal.
There are engines which obsessively disallow certain kinds of illegality (e.g. more than 8 White pawns, taking into account promoted material). As a composer, I find such behaviour inconvenient. Sure, legality is nice and all, but we should be able to ignore it early in the design, as there are harder challenges in composition.
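For concreteness, the pawn-budget test mentioned above might look something like the sketch below. The function name and the `INITIAL` table are my own assumptions rather than any engine's code, the input is just the piece-placement field of a FEN string, and refinements such as bishop square colours are ignored.

```python
from collections import Counter

# How many of each non-pawn, non-king piece a side starts the game with.
INITIAL = {"Q": 1, "R": 2, "B": 2, "N": 2}

def exceeds_pawn_budget(placement, white=True):
    """True if pawns plus the minimum number of promoted pieces exceed 8."""
    # Keep only this side's piece letters (uppercase for White in FEN).
    letters = [c for c in placement if c.isalpha() and c.isupper() == white]
    counts = Counter(c.upper() for c in letters)
    # Any piece beyond the initial complement must have come from a promotion.
    promoted = sum(max(0, counts[p] - n) for p, n in INITIAL.items())
    return counts["P"] + promoted > 8
```

A diagram rejected by this test can still be well-formed and perfectly analysable, which is why rejecting it outright gets in the way.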
The distinction between well-formed positions & legal positions parallels the distinction between well-formed formulae & valid formulae (theorems) in mathematical logic.