Teams/clusteradm/tiny-mirror

For mini-sites, we can probably do something like this.

Baseline specs:

One machine
reasonably modern cpu.
At least 96G ram; more would be better.
remote management capability (serial console or IPMI)
Hardware suitable to run zfs on (AHCI/HBA, *not raid*), aiming for 16TB+ for room to grow.
reliable network driver. eg: em / igb. Preferably not a re0.

Suitable drives structure:

SATA is perfectly fine as long as the machine has plenty of ram (Not raid cache!).
4 x 8TB sata drives on onboard AHCI controller (between 16TB and 20TB usable; more disks are preferable for improving IOPS.)
4+ x 8TB sata drives on onboard AHCI (16TB usable)
8 x 2-4TB sata or sas on something like a mps / mpr LSI card. Not mfi / MegaRAID.
If it has to be SAS, then something like an LSI SAS 2008 series HBA card would do.

Network:

Naked internet drop, or close to it. The host will be running its own stateful pf firewall.
IPv6. It's going to need to talk to our sync and C&C servers that are IPv6-only.
We can do IPv6 through a tunnel if needed, so long as it's reliable. (XXX honestly though?)
Several available IPv4 addresses. We typically use a /28:
- host native presence (ssh etc, ntp replies)
- ftp service (port 21, 49152-65535)
- http pkg service (80, 443)
- git mirror (22, 443)
- web mirror (80, 443)
- dns resolver source address - validating dnssec resolver (full range udp, used randomly for queries on an otherwise empty address)
- geo-dns server (53 udp, 53 tcp)
A /64 or so of IPv6 space. All of the above are on IPv6 as well, plus command & control, mirroring systems, rsync etc.
This should be set up like we have on the firetruck systems. The subnets would be routed to the host, rather than the host sitting on a shared subnet, as that's difficult to firewall.
Ability to have matching forward and reverse dns.
how to name and number stuff, particularly ipv6: Teams/clusteradm/hostnames-and-ipaddresses
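
As a rough check on the IPv4 numbers, the sketch below counts the addresses in a /28 the conventional way (network and broadcast excluded) and compares that with the roles listed above. The role names are just shorthand labels for this example, and the one-address-per-role mapping is an assumption taken from the list, not a statement of how the cluster is actually numbered.

```shell
#!/bin/sh
# Conventional usable-address count for an IPv4 prefix:
# all addresses minus network and broadcast. Valid for /30 and shorter.
usable_ipv4() {
    prefix=$1
    echo $(( (1 << (32 - prefix)) - 2 ))
}

# Shorthand labels for the roles in the list above (one address each, assumed).
roles="host ftp pkg git web resolver geodns"

count=0
for r in $roles; do
    count=$((count + 1))
done

echo "usable in /28: $(usable_ipv4 28)"                 # 14
echo "roles needing an address: $count"                 # 7
echo "spare: $(( $(usable_ipv4 28) - count ))"          # 7
```

The 14 here lines up with the "approx 14x ipv4" figure in the summary; the spare addresses are what allows squeezing the allocation a bit.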

Remote admin capabilities:

Access to serial console / remote power. Either actual serial or IPMI.
a 16GB+ USB flash drive that has the latest FreeBSD RELEASE memstick image on it.
The ability to OPTIONALLY remotely select the flash drive to boot.
On the occasion that we break it, the idea is we can use the console to boot it from the flash drive and fix whatever we did.

Security:

These things don't allow ssh passwords, ever, even for root. ssh keys only.
we want to be able to ssh to it.
Ideally, this sort of thing would be placed near or outside your border.
In the event of a security incident, it will be able to be wiped without drama.
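
A minimal sketch of how the keys-only rule could be spot-checked against an sshd_config-style file. The function name keys_only and the sample file are hypothetical, and the grep is a simplification: it ignores Match blocks and first-match-wins ordering, so `sshd -T` (which prints the effective configuration) is the more reliable check on a real host. This is not the cluster's actual configuration.

```shell
#!/bin/sh
# Return 0 only if the file explicitly disables password-style logins
# and restricts root to key-based login. Simplified: no Match handling.
keys_only() {
    conf=$1
    grep -Eiq '^[[:space:]]*PasswordAuthentication[[:space:]]+no[[:space:]]*$' "$conf" &&
    grep -Eiq '^[[:space:]]*(KbdInteractive|ChallengeResponse)Authentication[[:space:]]+no[[:space:]]*$' "$conf" &&
    grep -Eiq '^[[:space:]]*PermitRootLogin[[:space:]]+(prohibit-password|without-password)[[:space:]]*$' "$conf"
}

# Hypothetical sample config, written to a temp file for the demo.
sample=$(mktemp)
cat > "$sample" <<'EOF'
PasswordAuthentication no
KbdInteractiveAuthentication no
PermitRootLogin prohibit-password
EOF

if keys_only "$sample"; then
    echo "keys only"
else
    echo "password logins may be possible"
fi
rm -f "$sample"
```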

Mirroring info:

We do synchronous atomic publishing across the entire mirror network.
The pkg system *requires* this if we do geo-dns routing.
This means package data is published according to the *slowest* mirror.
We can do multiple mirror source paths to find the fastest transfer rates (e.g. we currently do us-west -> europe -> russia rather than us-west -> russia).
We mirror over ipv6. If your ipv6 transit is unreliable, we'll break it for sure.
If the mirror is significantly holding up publication of data we'll have to pull it from the geo-dns rotation.
We would like to operate a tightly held and monitored ftp / package / git / distcache / (perhaps www) mirror as part of the main ftp.freebsd.org, pkg.freebsd.org, git.freebsd.org, www.freebsd.org etc geo-dns services. To be a part of that, we need close control.
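
To illustrate why one slow site matters: with synchronous publishing, the publish time for a data set is the maximum of the per-mirror sync times. The sketch below just picks that maximum. The mirror names and minute values are made up for the example, and the slowest function is hypothetical, not one of our tools.

```shell
#!/bin/sh
# Given "name:minutes" pairs, print the mirror that gates publication
# and its sync time (the maximum across all mirrors).
slowest() {
    worst_name=""
    worst=0
    for entry in "$@"; do
        name=${entry%%:*}
        mins=${entry##*:}
        if [ "$mins" -gt "$worst" ]; then
            worst=$mins
            worst_name=$name
        fi
    done
    echo "$worst_name $worst"
}

# Hypothetical sync times in minutes.
slowest us-west:20 europe:35 tiny-site:90    # prints "tiny-site 90"
```

In this made-up case every other mirror waits on tiny-site, which is the situation where a site would have to be pulled from the geo-dns rotation.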

Privacy:

We process and aggregate traffic logs for usage statistics for feedback to the project so we can direct efforts accordingly.
This does not give us the means to track users.

Size at time of writing:

ftp: 1.2TB (approximately 100GB of new data each week for snapshots etc.)
ports distfiles: 2.4TB (part of distcache)
local distfiles: 147GB (part of distcache)
pkg: 5.4TB (35-45GB per package set, for all the permutations. About 30-50% are replaced each week.)
Preliminary stats: 3x to 5x outbound amplification during routine updates, much more after a release.
Naturally, traffic doesn't create itself. More mirrors reduce the share each site gets. (This is about quality of life for local users, not so much about saving traffic at this point).
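
Adding up the figures above gives a feel for how much of the 16TB baseline is already spoken for. This is only arithmetic on the numbers listed, assuming 1TB = 1000GB and treating 16TB as the usable pool size; it ignores ZFS overhead and snapshots.

```shell
#!/bin/sh
# Sizes from the list above, in GB (assuming 1TB = 1000GB).
ftp=1200
ports_distfiles=2400
local_distfiles=147
pkg=5400

total=$((ftp + ports_distfiles + local_distfiles + pkg))
target=16000    # the 16TB+ baseline, taken as usable space
headroom=$((target - total))

echo "total: ${total}GB"                    # total: 9147GB
echo "headroom at 16TB: ${headroom}GB"      # headroom at 16TB: 6853GB
```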

In summary:

1 machine, 16TB+ NON-RAID usable space for ZFS, 24G+ ram.
naked ipv4 and ipv6 with our own firewall.
approx 14x ipv4 (although we can squeeze it a bit, we will need more than 1). A /64 for ipv6. Reverse dns.
Ability to remotely admin/console/power/reboot/reinstall so we don't have to bother you.
have acceptable connectivity to another mirror to sync from and possibly fan out to others.
This isn't for "just an ftp mirror"; it is for participating in our primary front-line services. (download.freebsd.org, pkg.freebsd.org, git.freebsd.org, www.freebsd.org, docs.freebsd.org, etc)

Initial bootstrap:

since we're going to depend on not having netboot, initial bringup should be done as a remote cluster install to prove it can be done.
it will be slow, but we depend on this working, so don't skip it.
NEVER EVER EVER use sysinstall/bsdinstall to bootstrap - we need no surprises when using internal tools.

# fetch -o /tmp/install.sh http://admin-http.freebsd.org/install/pxeroot/install.sh
# /tmp/install.sh -b 10 -h hostname.site -1 ada0 ada1 ada2 ada3

.. then the jexec to post install setup, ksu etc.