Alternatively, look for an openly licensed dataset in your domain, if one exists (e.g. via https://fairsharing.org or, shameless plug, https://biokeanos.com). Generated data always bakes in your own assumptions; with more 'wild' data you have a chance to discover edge cases earlier.