From your post, it sounds like there may be a few things you're not aware of about using IEMs on stage, so I'll go through the basics in detail. I hope I'm not repeating information you already know.
The way IEMs are implemented for a live stage performance:
- Each musician in the band is mic'd and routed through a sound board
- The IEMs are plugged into a battery powered wireless belt pack
- The sound board feeds the sound from each musician into a wireless transmitter, which sends the signal to the belt packs
- The sound engineer can adjust the volume of each musician's mic feed in the mix
- The musicians also have volume control on their belt pack just to control the overall volume.
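To make the signal chain above a bit more concrete, here is a minimal sketch of what one musician's monitor mix amounts to. The function names, channel names and gain values are all made up for illustration; a real sound board does this in hardware or DSP, not in Python.

```python
def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10 ** (db / 20)


def monitor_mix(channel_samples, mix_db, pack_volume_db=0.0):
    """Mix one sample from each mic channel into one monitor feed sample.

    channel_samples: {channel name: sample value} coming off the sound board
    mix_db: {channel name: gain in dB} set by the sound engineer
    pack_volume_db: overall volume the musician sets on the belt pack
    """
    mixed = 0.0
    for name, sample in channel_samples.items():
        if name in mix_db:  # channels left out of the mix stay silent
            mixed += sample * db_to_gain(mix_db[name])
    # The belt pack only scales the whole mix, not individual channels.
    return mixed * db_to_gain(pack_volume_db)


# Hypothetical example: vocals at full level, guitar 20 dB down, drums muted.
samples = {"vocals": 1.0, "guitar": 1.0, "drums": 1.0}
engineer_mix = {"vocals": 0.0, "guitar": -20.0}
feed = monitor_mix(samples, engineer_mix, pack_volume_db=-6.0)
print(round(feed, 3))
```

The point is just the split in control: the engineer sets the balance between channels, and the belt pack scales the finished mix up or down.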
So basically, all of the noise from your band and the crowd is heavily attenuated by the IEM. A good custom IEM will attenuate outside sound by about 26 dB, give or take depending on the material of the IEM. This gives you a very quiet, almost silent environment to start working from. That's how IEMs protect a musician's hearing: by blocking out everything and letting the musician and the sound engineer determine what sound is let in.
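To put the 26 dB figure in perspective, here is a quick back-of-the-envelope calculation. The 105 dB SPL stage level is just an assumed number for illustration, not a measurement, and real isolation varies with frequency and fit.

```python
def attenuated_level(outside_db_spl, attenuation_db=26.0):
    """Level that gets past the IEM's passive isolation, in dB SPL."""
    return outside_db_spl - attenuation_db


def pressure_ratio(attenuation_db):
    """Fraction of the original sound pressure that remains."""
    return 10 ** (-attenuation_db / 20)


stage_level = 105.0  # assumed stage level in dB SPL (hypothetical)
in_ear = attenuated_level(stage_level)
remaining = pressure_ratio(26.0)

print(f"Outside: {stage_level} dB SPL, past the IEM: {in_ear} dB SPL")
print(f"Sound pressure remaining: about {remaining:.1%}")
```

So 26 dB of isolation leaves roughly one twentieth of the original sound pressure, which is why the monitor mix can be run at a much lower level than a stage wedge.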
The obvious issue here is that you are isolating yourself from the crowd almost completely. Most of the time the sound engineer will also mic the crowd and feed it into the mix at a reduced volume, so the musicians can feel some interactivity with the crowd and get a sense of its reaction. This takes a bit of getting used to. Some musicians never get used to the lack of real crowd sound, and that's why you see some of them perform with one IEM in and the other taken out, so they can hear the "real crowd". The problem is that you're prone to cause even more damage to one of your ears when you do this.
Your perception of a sound coming into both ears at the same volume goes up by about 6 dB. Meaning if you hear the same signal from both earphones, your brain combines the two and you perceive the sound as approximately 6 dB louder than you would from either side alone. To get the same loudness from only one ear, you have to turn up the volume by about 6 dB. So you can imagine how much damage that does to your ear if you use only one IEM when performing on stage.
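Here is a rough sketch of what an extra 6 dB means in practice. It uses the NIOSH recommended exposure limit (85 dBA for 8 hours, with the allowed time halving for every 3 dB above that) as a yardstick. The 94 dB listening level is an assumed number for illustration, and treating an in-ear level like a workplace noise measurement is a simplification.

```python
def amplitude_ratio(db):
    """Linear sound pressure ratio for a level change in dB."""
    return 10 ** (db / 20)


def niosh_hours(level_dba):
    """Recommended daily exposure time under the NIOSH limit:
    8 hours at 85 dBA, halved for every 3 dB above that."""
    return 8.0 / 2 ** ((level_dba - 85.0) / 3.0)


both_ears = 94.0           # assumed level with both IEMs in (hypothetical)
one_ear = both_ears + 6.0  # turned up to sound equally loud with one IEM

print(f"+6 dB is about {amplitude_ratio(6.0):.2f}x the sound pressure")
print(f"{both_ears} dB: {niosh_hours(both_ears) * 60:.0f} minutes per day")
print(f"{one_ear} dB: {niosh_hours(one_ear) * 60:.0f} minutes per day")
```

Under these assumptions, turning up by 6 dB roughly doubles the sound pressure and cuts the recommended exposure time to a quarter, from about an hour to about 15 minutes.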
Sensaphonics actually makes an IEM called 3D Active Ambient, which has microphones built into each of the earphones to not only "mic the crowd" but basically create a binaural mic feed from the perspective of the individual musician. So the musicians can adjust, from the custom belt pack, how much crowd/surrounding noise they want to let through. Having the microphones on the earphones creates an ambient sound that feels more natural and realistic than simply pointing microphones at the crowd.
Overall, even if you go with the cheapest solution and use universal IEMs instead of customs, you still have to spend a fair amount of money on a sound board (if you don't have one already) and wireless transmitter/receiver packs. I don't know much about the quality of wireless transmitters for personal monitoring systems, but Shure does make a lot of them, so that's at least a good place to start looking.